pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Driss Guessous	71369051ee	[Nested Tensor] fix from_padded bug (#84217 ) Fixes #84082 As explained in the issue, the problem arose from grad not being contiguous and the fast kernel not handling this case gracefully. The other thing I can do is add a contiguous call to `d144594512/aten/src/ATen/native/nested/cuda/NestedTensorTransformerFunctions.cpp (L45)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/84217 Approved by: https://github.com/albanD	2022-08-30 03:48:11 +00:00
Driss Guessous	2436cf8aa8	[Nested Tensor] detach (#84078 ) ## Summary Add detach op for nested tensors. Nested tensors are not part of the composite explicit dispatch key set and therefore need to be added manually. The Detach test is failing only for the dtype=torch.float32, torch.float16 and device=cuda. The chain of ops that are called is sum.backward() -> from_padded() -> unbind(). This populates the grad for a and b. Does this potentially indicate that the CUDA implementation for one of these ops, likely from_padded(), is incorrect? Pull Request resolved: https://github.com/pytorch/pytorch/pull/84078 Approved by: https://github.com/albanD	2022-08-29 09:12:26 +00:00
PyTorch MergeBot	f4f54c7ce1	Revert "[Nested Tensor] detach (#84078 )" This reverts commit `092fe71f33`. Reverted https://github.com/pytorch/pytorch/pull/84078 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally	2022-08-28 15:30:21 +00:00
Driss Guessous	092fe71f33	[Nested Tensor] detach (#84078 ) ## Summary Add detach op for nested tensors. Nested tensors are not part of the composite explicit dispatch key set and therefore need to be added manually. The Detach test is failing only for the dtype=torch.float32, torch.float16 and device=cuda. The chain of ops that are called is sum.backward() -> from_padded() -> unbind(). This populates the grad for a and b. Does this potentially indicate that the CUDA implementation for one of these ops, likely from_padded(), is incorrect? Pull Request resolved: https://github.com/pytorch/pytorch/pull/84078 Approved by: https://github.com/albanD	2022-08-27 03:00:55 +00:00
Yifan Shen	b3c99bef0c	Support nested dropout autograd (#83338 ) When the initial version came out, `NestedTensor` was not included in the `CompositeImplicitAutograd` key set, so we had to register dropout_nested to dropout and make it forward-only. Now is the time to improve it! This pr removes dropout_nested; instead native_dropout_nested is implemented along with native_dropout_backward_nested. Side change: remove dropout__nested since @cpuhrsch suggested to leave out nested in-place ops for now Pull Request resolved: https://github.com/pytorch/pytorch/pull/83338 Approved by: https://github.com/jbschlosser	2022-08-18 00:49:29 +00:00
Mikayla Gawarecki	bd0ad7a84f	Add backward support for rudimentary NestedTensor.sum(dim) (#82625 ) Per offline discussion, this will be updated to use expand once expand semantics for nested tensor have been fleshed out. Next steps will be to add support for other features for forward sum mentioned on #82387 and likewise update the backward Pull Request resolved: https://github.com/pytorch/pytorch/pull/82625 Approved by: https://github.com/albanD	2022-08-17 18:12:00 +00:00
Driss Guessous	4b597019b7	[Nested Tensor] Created Nested Tensor to Nested Tensor Views (#82658 ) # Summary This PR pulls out all the changes from #81838 specific to properly creating nested_tensor views. I will update this comment with a design doc once that has been made. This should enable proper creation of NestedTensor views: two nested_tensors sharing the same buffer_ but with different NestedTensor metadata. The function `create_nested_tensor_view` is a helper function for creating a new nested tensor whose storage aliases the base, causing the underlying storage to be shared, and is therefore a view. This function by itself is not differentiable and therefore autograd does not track its uses. If a nested tensor function implementation uses this helper in its implementation the aten_op must meet two requirements: - The function must return a view of the input - The function must be explicit and define its backward ## Testing A bug was found when creating a base tensor outside inference mode and then creating a view in inference mode. This test has been added to this PR in order to show the effect of the change. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82658 Approved by: https://github.com/albanD	2022-08-16 20:22:21 +00:00
Driss Guessous	c5c0dd9b62	Update shallow_copy_and_detach for nested tensor impls (#83002 ) # Summary This change fixes a bug that was encountered when trying to add more backward formulas for nested tensor ops. If a derivative is defined that stores the "result" for use in the backward, the output of the forward op is saved using: ``` if (grad_fn) { grad_fn->result_ = SavedVariable(result, true); } ``` SavedVariable calls a series of functions which in turn call shallow_copy_and_detach, and when `c179597753/c10/core/TensorImpl.cpp (L533)` is hit this calls sizes_custom(), which is not implemented and errors. I also noticed that, since the storage format is different for nested_tensor (not `storage_ ` but instead two tensors), we should actually be calling the NestedTensorImpl constructor. This PR overrides shallow_copy_and_detach from the derived class and ensures that shallow copy works correctly. ## Update - Added the softmax derivative in this PR because that is a direct use case that was blocked by not having shallow_copy_and_detach work correctly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83002 Approved by: https://github.com/albanD	2022-08-10 20:34:46 +00:00
Driss Guessous	e816644495	Add nested tensor contiguous (#82147 ) ### Description <!-- What did you change and why was it needed? --> The nested_tensor impl for `contiguous` was currently disabled. Prior to the work on nested_tensor transpose. Only contiguous nested tensors could be created from python. However now is possible to create nested tensors that are non contiguous. This pr links up the existing function used at the c++ level to the python function. ### Tests Updated Test in `test/test_nestedtensor.py` ### Notes The inference mode had to be removed for this test. This is because the func `.contiguous` is a composite implicit function. Currently this does not work in inference mode. However: https://github.com/pytorch/pytorch/pull/81838 should fix that issue. ### Why When writing kernels in Triton for nested tensors I exposed a helper function that returned the "Buffer" tensor to python. Now contiguity can be checked before running any triton kernel. Also a good follow up would be making `nt.contiguous` on non contiguous nested tensors return a contiguous nested tensor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82147 Approved by: https://github.com/jbschlosser	2022-08-09 01:51:37 +00:00
Joel Benjamin Schlosser	6ca95547ac	Initial private SDP interface and naive composite impl (#81956 ) Adds an initial private API version of the SDP interface. Signature: ``` _scaled_dot_product_attention(Tensor query, Tensor key, Tensor value, Tensor? attn_mask=None, float dropout_p=0.0, bool need_attn_weights=True, bool is_causal=False) -> (Tensor, Tensor) ``` Returns a tuple of `(output, attn_weights)`. Note the following: * `need_attn_weights`: flag indicating that attention weights should be computed. This is useful to toggle off for flash attention as it does not materialize the weights by default, making it more expensive to return them. * Boolean attention mask support only; `True` values within `attn_mask` indicate that the element should take part in attention (notably, this is reverse of MHA, which uses `True` to mask out values). Mask is optional. * `is_causal`: Temporary flag indicating whether to use a causal attention weighting. If this is set to `True`, it takes precedent over any value passed in for `attn_mask`. Longer term, the `is_causal` flagging can be subsumed into the `attn_mask` arg via tensor subclassing (see e.g. [CausalTensor](https://github.com/facebookresearch/xformers/blob/sparse_cleanup/xformers/sparse/causal_tensor.py) in xFormers). * Testing is currently done via reference with the existing Python impl of `F._scaled_dot_product_attention`. * This PR does not yet drop-in the new SDP anywhere. A future PR can hook it up in BT or MHA. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81956 Approved by: https://github.com/drisspg, https://github.com/erichan1	2022-08-01 22:26:18 +00:00
YifanShenSZ	4bb7e148c4	add nested tensor matmul support (#81957 ) There was a discussion on whether letting nested tensor `reshape` support collapsing and splitting dimension 0. The conclusion was to make reshape simple, so we need a tweaked `matmul`, which only supports 3+ dimension nonbroadcast case, i.e. a generalized `bmm`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81957 Approved by: https://github.com/jbschlosser	2022-07-30 22:35:09 +00:00
YifanShenSZ	5f9939f65e	Introduce discontinuity to nested tensor (#80981 ) Nested tensor used to assume the buffer memory to be contiguous. However, some operations can break that assumption: * reshape * transpose * slice To be able to access underlying tensors from discontinuous buffer, we need 3 metadata: * sizes of each tensor (`nested_size_tensor_`) * strides of each tensor (`nested_stride_tensor_`) * offset of each tensor (`offsets_`) so we access each tensor by `buffer.as_strided(size, stride, offset)` This pull request introduces the offsets metadata, then added reshape and transpose so that we can create discontinuous cases for testing. Unbind, select, dropout, softmax, bmm are refactored to provide tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80981 Approved by: https://github.com/jbschlosser	2022-07-30 04:08:30 +00:00
Mikayla Gawarecki	89c0123ba0	Add rudimentary NestedTensor.sum(dim) (#82387 ) A first step towards adding dimension-wise reductions to NestedTensor, - Assumes tensors in the nested tensor as well as the buffer of the nested tensor are contiguous - Always enforces `keepdim=True` - Only supports reduction across the last dimension - No support for acctype (`dtype` argument) - No autograd support - CPU only Next steps would be to add support for the above. For now this basic support is for prototyping to make sure `NestedTensor` can be used as an API for segment reductions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82387 Approved by: https://github.com/jbschlosser	2022-07-28 22:45:22 +00:00
PyTorch MergeBot	26776d628c	Revert "Initial private SDP interface and naive composite impl (#81956 )" This reverts commit `f15c5bf133`. Reverted https://github.com/pytorch/pytorch/pull/81956 on behalf of https://github.com/janeyx99 due to broke all configs on test_scaled_dot_product_attention (__main__.TestNestedTensorAutograd) `f15c5bf133`	2022-07-27 18:36:54 +00:00
Joel Benjamin Schlosser	f15c5bf133	Initial private SDP interface and naive composite impl (#81956 ) Adds an initial private API version of the SDP interface. Signature: ``` _scaled_dot_product_attention(Tensor query, Tensor key, Tensor value, Tensor? attn_mask=None, float dropout_p=0.0, bool need_attn_weights=True, bool is_causal=False) -> (Tensor, Tensor) ``` Returns a tuple of `(output, attn_weights)`. Note the following: * `need_attn_weights`: flag indicating that attention weights should be computed. This is useful to toggle off for flash attention as it does not materialize the weights by default, making it more expensive to return them. * Boolean attention mask support only; `True` values within `attn_mask` indicate that the element should take part in attention (notably, this is reverse of MHA, which uses `True` to mask out values). Mask is optional. * `is_causal`: Temporary flag indicating whether to use a causal attention weighting. If this is set to `True`, it takes precedent over any value passed in for `attn_mask`. Longer term, the `is_causal` flagging can be subsumed into the `attn_mask` arg via tensor subclassing (see e.g. [CausalTensor](https://github.com/facebookresearch/xformers/blob/sparse_cleanup/xformers/sparse/causal_tensor.py) in xFormers). * Testing is currently done via reference with the existing Python impl of `F._scaled_dot_product_attention`. * This PR does not yet drop-in the new SDP anywhere. A future PR can hook it up in BT or MHA. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81956 Approved by: https://github.com/drisspg, https://github.com/erichan1	2022-07-27 15:41:45 +00:00
PyTorch MergeBot	500be5998d	Revert "Introduce discontinuity to nested tensor (#80981 )" This reverts commit `b492f7c485`. Reverted https://github.com/pytorch/pytorch/pull/80981 on behalf of https://github.com/osalpekar due to This was reverted internally in D38142790, due to causing TorchScript inference failures	2022-07-26 21:40:42 +00:00
PyTorch MergeBot	0b0dbc59e6	Revert "Update shallow_copy_and_detach for nested tensor impls to enable nested tensor softmax backward (#81838 )" This reverts commit `6697f1e467`. Reverted https://github.com/pytorch/pytorch/pull/81838 on behalf of https://github.com/osalpekar due to Reverting this in order to revert https://github.com/pytorch/pytorch/pull/80981 cleanly. That diff caused GPU Inference breakage internally	2022-07-26 21:34:10 +00:00
PyTorch MergeBot	6c10a598ca	Revert "add nested tensor matmul support (#81957 )" This reverts commit `7bdafed4f1`. Reverted https://github.com/pytorch/pytorch/pull/81957 on behalf of https://github.com/osalpekar due to Reverting this in order to revert https://github.com/pytorch/pytorch/pull/80981 cleanly. That diff caused GPU Inference breakage internally	2022-07-26 21:10:28 +00:00
YifanShenSZ	7bdafed4f1	add nested tensor matmul support (#81957 ) There was a discussion on whether letting nested tensor `reshape` support collapsing and splitting dimension 0. The conclusion was to make reshape simple, so we need a tweaked `matmul`, which only supports 3+ dimension nonbroadcast case, i.e. a generalized `bmm`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81957 Approved by: https://github.com/jbschlosser	2022-07-26 16:58:42 +00:00
Driss Guessous	6697f1e467	Update shallow_copy_and_detach for nested tensor impls to enable nested tensor softmax backward (#81838 ) # Summary This change fixes a bug that was encountered when trying to add more backward formulas for nested tensor ops. If a derivative is defined that stores the "result" for use in the backward, the output of the forward op is saved using: ``` if (grad_fn) { grad_fn->result_ = SavedVariable(result, true); } ``` SavedVariable calls a series of functions which in turn call shallow_copy_and_detach, and when `c179597753/c10/core/TensorImpl.cpp (L533)` is hit this calls sizes_custom(), which is not implemented and errors. I also noticed that, since the storage format is different for nested_tensor (not `storage_ ` but instead two tensors), we should actually be calling the NestedTensorImpl constructor. This PR overrides shallow_copy_and_detach from the derived class and ensures that shallow copy works correctly. ## Update - Added the softmax derivative in this PR because that is a direct use case that was blocked by not having shallow_copy_and_detach work correctly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81838 Approved by: https://github.com/soulitzer	2022-07-25 20:04:40 +00:00
Yifan Shen	b492f7c485	Introduce discontinuity to nested tensor (#80981 ) Nested tensor used to assume the buffer memory to be contiguous. However, some operations can break that assumption: * reshape * transpose * slice To be able to access underlying tensors from discontinuous buffer, we need 3 metadata: * sizes of each tensor (`nested_size_tensor_`) * strides of each tensor (`nested_stride_tensor_`) * offset of each tensor (`offsets_`) so we access each tensor by `buffer.as_strided(size, stride, offset)` This pull request introduces the offsets metadata, then added reshape and transpose so that we can create discontinuous cases for testing. Unbind, select, dropout, softmax, bmm are refactored to provide tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80981 Approved by: https://github.com/jbschlosser	2022-07-21 17:17:25 +00:00
Driss Guessous	fca1523604	implement numel and tests for nested tensor (#80424 ) Add numel implementation for Nested Tensor. Currently the construction of nested size and nested_strides assume contiguous. This implementation was based off of the safe_compute_numel(). Having a TORCH_CHECK in a for loop kinda feels bad but I don't really know how performant numel needs to be. Since nested size is stored as a tensor: `nested_size_tensor().cumprod(dim=1).sum(dim=0)[1].item() ` Would also get the job done. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80424 Approved by: https://github.com/cpuhrsch	2022-06-28 18:02:44 +00:00
drisspg	2a09e95169	Register nested tensor linear kernel (#80397 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/80397 Approved by: https://github.com/soulitzer	2022-06-28 06:23:26 +00:00
Christian Puhrsch	2258db5da3	TensorImpl:::size_custom to support NestedTensor.size (#80236 ) This allows subclasses such as NestedTensorImpl to provide special behavior for `int64_t size(int64_t d)` that'll also be accessible by our Python frontend. It follows the same pattern as sizes_custom. Currently getting CI before asking for a review. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80236 Approved by: https://github.com/ezyang	2022-06-27 17:07:42 +00:00
Yifan Shen	09f79e94ac	support nested_tensor * scalar (#80284 ) In transformer, the scale step in attention has a `nested_tensor / scalar` operation. There are two ways to support that: 1. directly support `nested_tensor / scalar`: * pro: straightforward, good UX * con: is dispatching `mul(nested tensor, regular tensor)` a good practice? 2. let user manually convert `scalar` to `nested_scalar = torch.nested_tensor([broadcast_scalar])` * pro: dispatcher only has to deal with `mul(nested tensor, nested tensor)` * con: confusing manual conversions, bad UX Pull Request resolved: https://github.com/pytorch/pytorch/pull/80284 Approved by: https://github.com/cpuhrsch	2022-06-27 14:15:05 +00:00
Yifan Shen	fc0faa2cf6	Support nested_tensor.bmm (#80224 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/80224 Approved by: https://github.com/cpuhrsch, https://github.com/jbschlosser	2022-06-25 03:19:46 +00:00
Yifan Shen	54a1cc5246	Support softmax(nested tensor) (#80179 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/80179 Approved by: https://github.com/cpuhrsch, https://github.com/jbschlosser	2022-06-24 14:50:24 +00:00
Yifan Shen	f749f86fee	Add nested tensor metadata nested_stride then use it in unbind, select (#79831 ) 2 reasons to add metadata `nested_stride`: 1. it will be used later in `reshape` and `transpose` 2. it reduces the computation to get offsets and shapes necessary in `unbind`-like codes, which will be used again and again in nested tensor operations `unbind` and `select` are refactored to make use of `nested_stride` Pull Request resolved: https://github.com/pytorch/pytorch/pull/79831 Approved by: https://github.com/cpuhrsch, https://github.com/jbschlosser	2022-06-23 20:24:50 +00:00
Driss Guessous	a098937c20	Add factory function derivatives (#79872 ) Adding derivatives for factory functions, this issue is used for tracking: #79044 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79872 Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer	2022-06-21 00:53:11 +00:00
Edward Z. Yang	f7ee061638	Wconstab/reland pysymint (#79795 ) rebased https://github.com/pytorch/pytorch/pull/79617/ to see if issues are reproducible. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79795 Approved by: https://github.com/malfet	2022-06-20 22:55:06 +00:00
Yifan Shen	1b25aa6786	Support dropout(nested tensor) (#79318 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/79318 Approved by: https://github.com/jbschlosser	2022-06-17 18:41:54 +00:00
PyTorch MergeBot	8a7a5def1d	Revert "Support dropout(nested tensor) (#79318 )" This reverts commit `1211ab679c`. Reverted https://github.com/pytorch/pytorch/pull/79318 on behalf of https://github.com/janeyx99 due to Broke dropout tests on trunk, also errors on PR	2022-06-17 04:56:29 +00:00
Yifan Shen	1211ab679c	Support dropout(nested tensor) (#79318 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/79318 Approved by: https://github.com/jbschlosser	2022-06-17 00:46:07 +00:00
drisspg	f9656817df	Add nested tensor support to autograd (#79446 ) The issue that is tracking this work is: #79447 This is one in a series of PRs to add autograd support for nested tensors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79446 Approved by: https://github.com/soulitzer	2022-06-16 21:09:17 +00:00
PyTorch MergeBot	44436947bc	Revert "Reland PySymInt (#79617 )" This reverts commit `8ef6356f26`. Reverted https://github.com/pytorch/pytorch/pull/79617 on behalf of https://github.com/zengk95 due to this is breaking periodic jobs (and maybe pull) on trunk	2022-06-16 19:40:27 +00:00
Nikolay Korovaiko	8ef6356f26	Reland PySymInt (#79617 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/79617 Approved by: https://github.com/Chillee	2022-06-16 04:18:06 +00:00
PyTorch MergeBot	b8db0a0475	Revert "Python Bindings for SymInts (#78135 )" This reverts commit `d332724071`. Reverted https://github.com/pytorch/pytorch/pull/78135 on behalf of https://github.com/ezyang due to broke torchvision tests	2022-06-15 13:52:14 +00:00
Nikolay Korovaiko	d332724071	Python Bindings for SymInts (#78135 ) This PR adds support for `SymInt`s in python. Namely, * `THPVariable_size` now returns `sym_sizes()` * python arg parser is modified to parse PyObjects into ints and `SymbolicIntNode`s * pybind11 bindings for `SymbolicIntNode` are added, so size expressions can be traced * a large number of tests added to demonstrate how to implement python symints. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78135 Approved by: https://github.com/ezyang	2022-06-14 02:17:59 +00:00
YifanShenSZ	6ad51c9422	Support indexing of the underlying tensors for nested tensors (#78934 ) Fixes #76843 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78934 Approved by: https://github.com/cpuhrsch, https://github.com/jbschlosser	2022-06-08 21:05:04 +00:00
Christian Puhrsch	c81c0b6d42	Support clone for NestedTensor (#78826 ) Potentially fixes #78754. Adding coverage for clone to NestedTensor surely won't hurt. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78826 Approved by: https://github.com/jbschlosser	2022-06-06 19:30:58 +00:00
Michael Anderson	76abbbe317	Adding output_size to to_padded_tensor (#76640 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76640 - Adding output_size argument to to_padded_tensor - Modified add_padding_kernelLauncher and kernels to iterate over padded tensor batch size instead of nested tensor batch size - No fast path for CPU version Test Plan: buck test mode/dev-nosan //caffe2/test:nested Performance test using N1763981: {F728168808} Reviewed By: cpuhrsch Differential Revision: D36056902 fbshipit-source-id: d6df2939d6649128a7f43a2ef32d227870a8e583 (cherry picked from commit 09465f36f09d4d74c9b3303981d8cce0c7c1092a)	2022-05-03 18:22:51 +00:00
Michael Anderson	4d6b145bb2	Nested Tensor Elementwise (#76470 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76470 Adding elementwise add and mul for nested tensor Test Plan: ``` buck test mode/dev-nosan //caffe2/test:nested ``` Perf testing with N1763981: {F727236009} Reviewed By: cpuhrsch Differential Revision: D35773182 fbshipit-source-id: f1c0ac40716b616884de3db80cdc7baf141cde7f (cherry picked from commit 3b3c8fd7b2922556dc1a6afc0cb29cf7da7d0fc1)	2022-05-03 18:22:51 +00:00
Joel Benjamin Schlosser	fd3cfb683a	Fix string repr for nested tensors & run nested tensor tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/76541 Approved by: https://github.com/ishaan-mehta, https://github.com/cpuhrsch	2022-04-28 20:51:30 +00:00
Christian Puhrsch	ea1901693e	Port to_padded_tensor CUDA kernel from pytorch/nestedtensor This PR adds a custom CUDA kernel to pad NestedTensors between dimension 2 and 4. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76157 Approved by: https://github.com/ngimel	2022-04-27 20:41:40 +00:00
Scott Wolchok	0a5e788ab2	[PyTorch] Add NestedTensorCPU and NestedTensorCUDA dispatch keys (#75808 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/75808 Just as it is often difficult to write a single kernel that can handle both CPU and CUDA, so can it be difficult to do the same for NestedTensor. ghstack-source-id: 154171542 (Note: this ignores all push blocking failures!) Test Plan: CI? Reviewed By: bdhirsh Differential Revision: D35603836 fbshipit-source-id: fb0ebb19d34531ed96ce176aca325f8e2b5f90e6 (cherry picked from commit 0bcd753f93c04256c1b745f84a74ecccf0dceef5)	2022-04-19 18:12:12 +00:00
Scott Wolchok	97c993ca7a	[PyTorch] Add NestedTensor support functions for transformers Pull Request resolved: https://github.com/pytorch/pytorch/pull/75491 Here are the NestedTensor kernels we'll need for the improved transformer implementation. Differential Revision: [D35409275](https://our.internmc.facebook.com/intern/diff/D35409275/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35409275/)! Approved by: https://github.com/cpuhrsch	2022-04-14 16:30:23 +00:00
Scott Wolchok	70cab5ebb1	[PyTorch] NestedTensor kernels for {r,g}elu{,_} Pull Request resolved: https://github.com/pytorch/pytorch/pull/75370 These are simple element-wise ops it's convenient to be able to use with NestedTensor. Differential Revision: [D35448205](https://our.internmc.facebook.com/intern/diff/D35448205/) Approved by: https://github.com/ngimel	2022-04-08 17:54:09 +00:00
Scott Wolchok	90be8fa279	[PyTorch] Make TensorImpl::sizes() customizable and disable it for NestedTensorImpl (#73817 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73817 NestedTensorImpl doesn't have sizes(). Silently getting wrong results back from it is not conducive to efficient software development. Make it throw while allowing sizes() to be inlined in the common case anyway, just like is_contiguous(). Thanks ezyang for the reminder that we could do this. ghstack-source-id: 151302903 Test Plan: Updated test_nestedtensor.py Reviewed By: ezyang Differential Revision: D34660829 fbshipit-source-id: 1289f21127d6a8359893f9174f3c430a290f2c7f (cherry picked from commit 7098b9fcfbd25a03bac19e1148426ff073810edd)	2022-03-15 19:24:57 +00:00
Christian Puhrsch	484c0de670	Minimal NestedTensor (#72881 ) Summary: This PR adds a minimal version of a NestedTensor. It introduces the general harness future development can be built around. Pull Request resolved: https://github.com/pytorch/pytorch/pull/72881 Reviewed By: albanD Differential Revision: D34259177 Pulled By: cpuhrsch fbshipit-source-id: 0245c36f603424e20f3b09651043c207f526d760 (cherry picked from commit 10764e8d427f29b364567e4cbc86ed73c3933158)	2022-03-02 16:31:51 +00:00
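
Commit 5f9939f65e describes reaching each constituent of a nested tensor through `buffer.as_strided(size, stride, offset)`, and commit fca1523604 computes `numel` from the nested sizes. The following is a minimal pure-Python sketch of that bookkeeping, not the ATen implementation: the helper names (`as_strided_2d`, `unbind`, `nested_numel`) are hypothetical, a flat list stands in for the buffer, and every constituent is assumed to be 2-D.

```python
from math import prod


def as_strided_2d(buffer, size, stride, offset):
    """Read a 2-D view from a flat buffer, in the spirit of
    buffer.as_strided(size, stride, offset). Hypothetical helper."""
    rows, cols = size
    return [
        [buffer[offset + i * stride[0] + j * stride[1]] for j in range(cols)]
        for i in range(rows)
    ]


def unbind(buffer, nested_sizes, nested_strides, offsets):
    """Recover every constituent from the three pieces of metadata."""
    return [
        as_strided_2d(buffer, size, stride, offset)
        for size, stride, offset in zip(nested_sizes, nested_strides, offsets)
    ]


def nested_numel(nested_sizes):
    """Total element count: sum over constituents of the product of their sizes."""
    return sum(prod(size) for size in nested_sizes)


# Two constituents packed back to back: a 2x3 block, then a 1x2 block.
buffer = list(range(8))
nested_sizes = [(2, 3), (1, 2)]
nested_strides = [(3, 1), (2, 1)]
offsets = [0, 6]

contiguous = unbind(buffer, nested_sizes, nested_strides, offsets)

# A transpose only swaps the per-constituent sizes and strides; the buffer
# and the offsets are untouched, so the constituents are no longer contiguous.
t_sizes = [(cols, rows) for rows, cols in nested_sizes]
t_strides = [(s1, s0) for s0, s1 in nested_strides]
transposed = unbind(buffer, t_sizes, t_strides, offsets)
```

Because the transposed view shares the buffer, only the metadata changes, which is why the commit needs sizes, strides, and offsets together to locate each constituent.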

49 Commits
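
The mask semantics described in commit 6ca95547ac (`True` in `attn_mask` means the element takes part in attention, and `is_causal=True` takes precedence over any `attn_mask`) can be sketched in pure Python. This is an illustrative reference only, not the PyTorch kernel: `sdp_reference` is a hypothetical name, inputs are single unbatched matrices given as lists of lists, dropout is omitted, the weights are always returned, the usual `1/sqrt(E)` scaling is assumed, and every row is assumed to keep at least one unmasked position.

```python
import math


def sdp_reference(query, key, value, attn_mask=None, is_causal=False):
    """Naive scaled dot-product attention over lists of lists.

    query: L x E, key: S x E, value: S x Ev.
    Returns (output, attn_weights). Hypothetical helper for illustration.
    """
    L, S, E = len(query), len(key), len(query[0])
    if is_causal:
        # The causal mask overrides whatever attn_mask was passed in.
        attn_mask = [[j <= i for j in range(S)] for i in range(L)]
    scale = 1.0 / math.sqrt(E)
    output, weights = [], []
    for i in range(L):
        scores = [
            scale * sum(q * k for q, k in zip(query[i], key[j]))
            for j in range(S)
        ]
        keep = [True] * S if attn_mask is None else attn_mask[i]
        # Softmax over the kept positions only; masked positions get weight 0.
        top = max(s for s, m in zip(scores, keep) if m)
        exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, keep)]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        output.append(
            [sum(w * v[d] for w, v in zip(row, value)) for d in range(len(value[0]))]
        )
    return output, weights


q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]

# The explicit mask would hide position 0 in row 0, but is_causal wins.
out, w = sdp_reference(q, k, v, attn_mask=[[False, True], [True, True]], is_causal=True)
```

Note that this boolean convention is the reverse of the MHA mask mentioned in the commit message, where `True` marks positions to mask out.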