Commit Graph

418 Commits

Author SHA1 Message Date
cyy
f27e09de04 Cleanup Windows warning suppression in CMake and fix some warnings in the source code (#94927)
This PR does two things:
1. It moves some Windows warning suppressions from various CMake files into the main CMakeLists.txt, following the conventions used for gcc and clang.
2. It fixes some Windows warnings in the source code. Most importantly, it fixes lots of DLL warnings by adjusting C10_API to TORCH_API or TORCH_PYTHON_API. Some DLL warnings remain because some TORCH_API functions are actually built as part of libtorch_python.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94927
Approved by: https://github.com/malfet
2023-02-27 19:22:20 +00:00
Ramin Azarmehr
bdd8f518d7 [MPS] Add Python Module Bindings for the MPS backend (#94417)
- This PR is a prerequisite for the upcoming Memory Leak Detection PR.
- Enable global manual seeding via `torch.manual_seed()` + test case
- Add `torch.mps.synchronize()` to wait for MPS stream to finish + test case
- Enable the following python interfaces for MPS:
  `torch.mps.[get_rng_state(), set_rng_state(), synchronize(), manual_seed(), seed()]`
- Added some test cases in test_mps.py
- Added `mps.rst` to document the `torch.mps` module.
- Fixed the failure with `test_public_bindings.py`

Description of new files added:
- `torch/csrc/mps/Module.cpp`: implements `torch._C` module functions for `torch.mps` and `torch.backends.mps`.
- `torch/mps/__init__.py`: implements Python bindings for `torch.mps` module.
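A minimal usage sketch of the newly exposed interfaces (assuming an MPS-capable macOS build; the names follow the list above):
```python
import torch

if torch.backends.mps.is_available():
    torch.mps.manual_seed(42)           # seed the MPS generator globally
    state = torch.mps.get_rng_state()   # snapshot the generator state
    x = torch.randn(4, device="mps") * 2
    torch.mps.synchronize()             # wait for the MPS stream to finish
    torch.mps.set_rng_state(state)      # restore the snapshot
```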
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94417
Approved by: https://github.com/albanD
2023-02-12 21:22:30 +00:00
PyTorch MergeBot
4fe365774a Revert "[MPS] Add Python Module Bindings for the MPS backend (#94417)"
This reverts commit beb4f5bf39.

Reverted https://github.com/pytorch/pytorch/pull/94417 on behalf of https://github.com/huydhn due to Sorry for reverting your PR, but it seems to break MacOS test in trunk bae397ec63
2023-02-11 05:24:45 +00:00
Ramin Azarmehr
beb4f5bf39 [MPS] Add Python Module Bindings for the MPS backend (#94417)
- This PR is a prerequisite for the upcoming Memory Leak Detection PR.
- Enable global manual seeding via `torch.manual_seed()` + test case
- Add `torch.mps.synchronize()` to wait for MPS stream to finish + test case
- Enable the following python interfaces for MPS:
  `torch.mps.[get_rng_state(), set_rng_state(), synchronize(), manual_seed(), seed()]`
- Added some test cases in test_mps.py
- Added `mps.rst` to document the `torch.mps` module.
- Fixed the failure with `test_public_bindings.py`

Description of new files added:
- `torch/csrc/mps/Module.cpp`: implements `torch._C` module functions for `torch.mps` and `torch.backends.mps`.
- `torch/mps/__init__.py`: implements Python bindings for `torch.mps` module.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94417
Approved by: https://github.com/albanD
2023-02-10 23:18:41 +00:00
Elias Ellison
70f4b3551c Add Hook to store arbitrary python objects that are copied over in tls (#89169)
For the cudagraphs implementation, we would like to reuse objects that are defined in Python across the forward and backward passes. The backward pass runs in a different thread, so to handle this we add an API for copying arbitrary Python objects into PyTorch's thread-local state, in the same way that C++ objects are currently copied over.
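A hedged sketch of how such an API might be used (the `_stash_obj_in_tls` / `_get_obj_in_tls` / `_is_key_in_tls` binding names are assumptions, not spelled out in this message):
```python
import torch

# Assumed binding names: stash a Python object in PyTorch's thread-local state
# so that code running on the backward (autograd) thread can see the same object.
container = {"step": 0}
torch._C._stash_obj_in_tls("my_container", container)

# Later, potentially from the backward thread:
if torch._C._is_key_in_tls("my_container"):
    torch._C._get_obj_in_tls("my_container")["step"] += 1
```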

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89169
Approved by: https://github.com/albanD
2023-01-24 05:24:57 +00:00
Pearu Peterson
b3e4f5029b Add check-sparse-tensor-invariants flag to Context - 2nd try. (#92094)
This PR is a copy of https://github.com/pytorch/pytorch/pull/90849, whose merge was reverted.

The PR adds a "check sparse tensor invariants" flag to Context that, when enabled, triggers sparse tensor data invariant checks in the unsafe methods for constructing sparse COO/CSR/CSC/BSR/BSC tensors. The feature includes the following changes to the UI:

The `torch.sparse.check_sparse_tensor_invariants` class provides different ways to enable/disable the invariant checking.

The `torch.sparse_coo/csr/csc/bsr/bsc/compressed_tensor` functions have a new optional argument `check_invariants` to enable/disable the invariant checks explicitly. When the `check_invariants` argument is specified, the global state of the feature is temporarily overridden.

The PR fixes https://github.com/pytorch/pytorch/issues/90833
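A brief sketch of the two mechanisms, global toggle and per-call override (assuming the class exposes `enable()`/`disable()` static methods and can act as a context manager, per the docs added here):
```python
import torch

# Global toggle (the class can also be used as a context manager or decorator).
torch.sparse.check_sparse_tensor_invariants.enable()

i = torch.tensor([[0, 1, 1], [2, 0, 2]])
v = torch.tensor([3.0, 4.0, 5.0])
t = torch.sparse_coo_tensor(i, v, (2, 3))  # validated because of the global flag

torch.sparse.check_sparse_tensor_invariants.disable()
# Per-call override: re-enables checking for just this constructor call.
t = torch.sparse_coo_tensor(i, v, (2, 3), check_invariants=True)
```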

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92094
Approved by: https://github.com/cpuhrsch
2023-01-13 14:50:33 +00:00
PyTorch MergeBot
c7a22bb7c7 Revert "Add check-sparse-tensor-invariants flag to Context. (#90849)"
This reverts commit b9a035c1c5.

Reverted https://github.com/pytorch/pytorch/pull/90849 on behalf of https://github.com/DanilBaibak due to Break internal build
2023-01-12 09:58:16 +00:00
samdow
b8252e07c7 [Reland] add DisableTorchFunction that matches DisableTorchDispatch (#88219) (#92012)
Reland of #88219

Closes #87990. This implements a new disable guard that matches DisableTorchDispatch (disables all subclasses and modes)
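A hedged illustration of the difference between the two guards (assuming they are exposed as `torch._C.DisableTorchFunction` and `torch._C.DisableTorchFunctionSubclass`):
```python
import torch

class LoggingTensor(torch.Tensor):
    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        print(f"__torch_function__ saw {func}")
        return super().__torch_function__(func, types, args, kwargs or {})

t = torch.randn(2).as_subclass(LoggingTensor)

with torch._C.DisableTorchFunctionSubclass():
    torch.add(t, t)  # subclass handler skipped; torch_function modes still run

with torch._C.DisableTorchFunction():
    torch.add(t, t)  # both subclass handlers and modes are skipped
```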
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92012
Approved by: https://github.com/albanD
2023-01-12 01:27:47 +00:00
Pearu Peterson
b9a035c1c5 Add check-sparse-tensor-invariants flag to Context. (#90849)
This PR adds "check sparse tensor invariants" flag to Context that when enabled will trigger sparse tensor data invariants checks in unsafe methods of constructing sparse COO/CSR/CSC/BSR/BSC tensors. The feature includes the following changes to UI:

- `torch.enable_check_sparse_tensor_invariants` and `torch.is_check_sparse_tensor_invariants_enabled` functions to globally enable/disable the invariant checks and to retrieve the state of the feature, respectively
- `torch.sparse_coo/csr/csc/bsr/bsc/compressed_tensor` functions have a new optional argument `check_invariants` to enable/disable the invariant checks explicitly. When the `check_invariants` argument is specified, the global state of the feature is temporarily overridden.

The PR also fixes https://github.com/pytorch/pytorch/issues/90833

# Main issue

*The following content is outdated after merging the PRs in this ghstack but kept for the record.*

The importance of this feature is that when enabling the invariants checks by default, say, via

<details>

```
$ git diff
diff --git a/torch/__init__.py b/torch/__init__.py
index c8543057c7..19a91d0482 100644
--- a/torch/__init__.py
+++ b/torch/__init__.py
@@ -1239,3 +1239,8 @@ if 'TORCH_CUDA_SANITIZER' in os.environ:

 # Populate magic methods on SymInt and SymFloat
 import torch.fx.experimental.symbolic_shapes
+
+# temporarily enable sparse tensor arguments validation in unsafe
+# constructors:
+
+torch._C._set_check_sparse_tensor_invariants(True)
```

</details>

a massive number of test failures/errors occur in test_sparse_csr.py tests:
```
$ pytest -sv test/test_sparse_csr.py
<snip>
==== 4293 failed, 1557 passed, 237 skipped, 2744 errors in 69.71s (0:01:09) ====
```
that means that we are silently constructing sparse compressed tensors that do not satisfy the sparse tensor invariants. In particular, the following errors are raised:

```
AssertionError: "resize_as_sparse_compressed_tensor_: self and src must have the same layout" does not match "expected values to be a strided and contiguous tensor"

RuntimeError: CUDA error: device-side assert triggered

RuntimeError: `col_indices[..., crow_indices[..., i - 1]:crow_indices[..., i]] for all i = 1, ..., nrows are sorted and distinct along the last dimension values` is not satisfied.

RuntimeError: expected col_indices to be a strided and contiguous tensor

RuntimeError: expected row_indices to be a strided and contiguous tensor

RuntimeError: expected values to be a strided and contiguous tensor

RuntimeError: for_each: failed to synchronize: cudaErrorAssert: device-side assert triggered

RuntimeError: tensor dimensionality must be sum of batch, base, and dense dimensionalities (=0 + 2 + 0) but got 3
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90849
Approved by: https://github.com/amjames, https://github.com/cpuhrsch
2023-01-11 01:05:14 +00:00
Samantha Andow
a7749ae177 [reland] rename DisableTorchFunction to DisableTorchFunctionSubclass (#88218) (#89221)
Summary: First half of #87990. This doesn't change any of the behavior and is just a rename

#88218 got reverted for internal breakages. This is the reland, starting from the internal diff.

Differential Revision:
D41268423

LaMa Project: L1098534

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89221
Approved by: https://github.com/meliy-meyada, https://github.com/zou3519
2023-01-04 18:32:49 +00:00
Eddie Yan
8b617f813d [cuBLAS] Add an option to disable reduced precision reductions for BF16 GEMM (#89172)
Essentially the same change as #67946, except that the default is to disallow reduced precision reductions in `BFloat16` GEMMs (for now). If performance is severely regressed, we can change the default, but this option appears to be necessary to pass some `addmm` `BFloat16` tests on H100.
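A small sketch of the new knob, assuming it mirrors the fp16 flag from #67946 and lives at `torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction`:
```python
import torch

# With this change the default is False: BF16 GEMMs keep full-precision reductions.
print(torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction)

# Opt back in if the speed/accuracy trade-off is acceptable for a given workload.
torch.backends.cuda.matmul.allow_bf16_reduced_precision_reduction = True
```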

CC @ptrblck @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89172
Approved by: https://github.com/ngimel
2022-12-21 18:58:28 +00:00
albanD
28ceccec21 cleanup old python_compat code (#91162)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91162
Approved by: https://github.com/ezyang
2022-12-20 18:13:19 +00:00
albanD
0eb45d546c Bind autograd current Node for debugging purposes (#90867)
This makes it possible to know, at any point during the backward pass, what is running and where the currently running Node was created:
```python
import torch
from torch.utils._python_dispatch import TorchDispatchMode
from torch.autograd import detect_anomaly

class MyMode(TorchDispatchMode):
    def __torch_dispatch__(self, func, types, args, kwargs=None):
        node = torch._C._current_autograd_node()
        print(f"Running {func} from within {node}")
        if node is not None:
            print("The Node was created at:")
            print("\n  ".join(node.metadata["traceback_"]))
        return func(*args, **kwargs or {})

with MyMode(), detect_anomaly():
    print("FW")
    a = torch.rand(10, requires_grad=True)
    b = a.mul(2)
    b = b.div(3)
    b = b.sum()
    print("BW")
    b.backward()
```

Gives
```
$ python foo.py
foo.py:15: UserWarning: Anomaly Detection has been enabled. This mode will increase the runtime and should only be enabled for debugging.
  with MyMode(), detect_anomaly():
FW
Running aten.rand.default from within None
Running aten.mul.Tensor from within None
Running aten.div.Tensor from within None
Running aten.sum.default from within None
BW
Running aten.ones_like.default from within None
Running aten.expand.default from within <SumBackward0 object at 0x7fa40c0c6dc0>
The Node was created at:
  File "foo.py", line 20, in <module>
    b = b.sum()

Running aten.isnan.default from within <SumBackward0 object at 0x7fa40c0c6500>
The Node was created at:
  File "foo.py", line 20, in <module>
    b = b.sum()

Running aten.any.default from within <SumBackward0 object at 0x7fa32b23a780>
The Node was created at:
  File "foo.py", line 20, in <module>
    b = b.sum()

Running aten._local_scalar_dense.default from within <SumBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 20, in <module>
    b = b.sum()

Running aten.div.Tensor from within <DivBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 19, in <module>
    b = b.div(3)

Running aten.isnan.default from within <DivBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 19, in <module>
    b = b.div(3)

Running aten.any.default from within <DivBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 19, in <module>
    b = b.div(3)

Running aten._local_scalar_dense.default from within <DivBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 19, in <module>
    b = b.div(3)

Running aten.mul.Tensor from within <MulBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

Running aten.isnan.default from within <MulBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

Running aten.any.default from within <MulBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

Running aten._local_scalar_dense.default from within <MulBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

Running aten.detach.default from within <AccumulateGrad object at 0x7fa40c0c9730>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

Running aten.detach.default from within <AccumulateGrad object at 0x7fa40c0c94b0>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90867
Approved by: https://github.com/soulitzer
2022-12-20 13:41:43 +00:00
Nikita Shulga
3859aace20 [MPS] Skip tests broken on Ventura (#90843)
Also add `torch.backends.mps.is_macos13_or_newer`
See https://github.com/pytorch/pytorch/issues/85758
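A minimal sketch of gating Ventura-only behavior on the new helper (assuming it is callable like the other `torch.backends.mps` predicates):
```python
import torch

if torch.backends.mps.is_available() and torch.backends.mps.is_macos13_or_newer():
    # Code paths (or tests) only expected to work on macOS 13 "Ventura" or later.
    x = torch.randn(3, device="mps")
```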

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90843
Approved by: https://github.com/kulinseth, https://github.com/albanD
2022-12-14 19:51:00 +00:00
Richard Zou
4b1053497c [vmap] Prepend "legacy" to files for old vmap implementation (#90324)
We have an older torch.vmap implementation. It is no longer supported.
It still needs to exist somewhere for the sake of BC with
torch.autograd.functional.

This PR makes it clear what files are meant for implementing the old
vmap implementation. I've seen a couple of PRs recently adding support
for the old vmap implementation, so this will lessen the confusion.

Test Plan:
- CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90324
Approved by: https://github.com/samdow
2022-12-07 18:46:15 +00:00
Edward Z. Yang
4908a12542 Reland "SymIntify convolution backend calculation (#89069)"" (#89142)
This reverts commit 90db86be10.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89142
Approved by: https://github.com/albanD, https://github.com/malfet
2022-11-16 21:41:47 +00:00
PyTorch MergeBot
90db86be10 Revert "SymIntify convolution backend calculation (#89069)"
This reverts commit 09ed8b67e2.

Reverted https://github.com/pytorch/pytorch/pull/89069 on behalf of https://github.com/DanilBaibak due to breaking internal builds
2022-11-16 16:36:27 +00:00
Edward Z. Yang
09ed8b67e2 SymIntify convolution backend calculation (#89069)
We will need this to implement a convolution meta function that
is SymInt aware.  I use templates so that regular convolution code
is not affected by the change.  No tests for symbolic ints directly; that will
come in a subsequent PR which also needs to refactor fake tensors.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89069
Approved by: https://github.com/SherlockNoMad
2022-11-16 14:02:43 +00:00
PyTorch MergeBot
ba4d5aae06 Revert "rename DisableTorchFunction to DisableTorchFunctionSubclass (#88218)"
This reverts commit 7f28be10e5.

Reverted https://github.com/pytorch/pytorch/pull/88218 on behalf of https://github.com/izaitsevfb due to BC-breaking change, D41211901
2022-11-11 19:13:05 +00:00
PyTorch MergeBot
4e5d7afe84 Revert "add DisableTorchFunction that matches DisableTorchDispatch (#88219)"
This reverts commit c0ecce15b5.

Reverted https://github.com/pytorch/pytorch/pull/88219 on behalf of https://github.com/izaitsevfb due to BC-breaking change, D41211901
2022-11-11 19:08:30 +00:00
samdow
c0ecce15b5 add DisableTorchFunction that matches DisableTorchDispatch (#88219)
Closes #87990. This implements a new disable guard that matches DisableTorchDispatch (disables all subclasses and modes)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88219
Approved by: https://github.com/ezyang
2022-11-10 14:51:13 +00:00
samdow
7f28be10e5 rename DisableTorchFunction to DisableTorchFunctionSubclass (#88218)
First half of #87990. This doesn't change any of the behavior and is just a rename

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88218
Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-11-10 14:51:13 +00:00
kshitij12345
eb9b156019 [fix] MathBits: serialization (#88182)
Fixes #81690

TODO:

* [x] C++ Unpickler Fix (locally tested pickled in Python and unpickled in C++)
* [x] C++ Pickler Fix (locally tested pickled in C++ and unpickled in Python)
* [x] Do quant_tensor, sparse_tensor, etc require similar changes? (Sparse and Quant don't need this)
* [x] Add Comments
* [x] How to make sure C++ and Python are in sync? (Functions in `pickler.h` help in getting and setting Tensor Metadata (math-bits for now) on a tensor. They are the only place which should handle this.)

Notes:
Quantized tensors don't support complex dtypes, and for float they segfault with `_neg_view`: https://github.com/pytorch/pytorch/issues/88484

Sparse Tensor:
```python
>>> a = torch.tensor([[0, 2.], [3j, 0]]).to_sparse()
>>> a.conj().is_conj()
False
>>> a._neg_view()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NotImplementedError: Cannot access storage of SparseTensorImpl
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88182
Approved by: https://github.com/ezyang, https://github.com/anjali411
2022-11-09 17:15:12 +00:00
Nikita Shulga
e1c123d29a Add UBSAN to ASAN (#88055)
Add undefined behavior sanitizer to `USE_ASAN` option.
Added `torch._C._crash_if_vptr_ubsan()`, which only fails if a vptr belongs to the wrong class after a typecast.
Deleted all UBSAN suppressions, but disabled `ProtoTest::Basic` as it fails the above-mentioned vptr check.

Fixes https://github.com/pytorch/pytorch/issues/88042
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88055
Approved by: https://github.com/ezyang
2022-11-01 17:59:35 +00:00
Elias Ellison
fc21b9db23 Use Eager Code To Determine Conv Layout (#87305)
The logic for determining the conv backend, and therefore the output striding, is very complex. It depends on build settings, input striding/contiguity, sizes, etc. Eventually we should port that logic to the meta impl for dynamic shapes, but that will require a lot more work and keeping the implementations in sync. See https://github.com/pytorch/torchdynamo/issues/1701

This is a prerequisite to removing inductor's conv stride propagation and, more generally, to using fake tensors for inductor propagation. In that PR, the meta impls for cpu conv gave incorrect striding, which led to test failures (https://github.com/pytorch/pytorch/pull/87083).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87305
Approved by: https://github.com/ezyang
2022-10-28 16:37:04 +00:00
Driss Guessous
35c611d30f Add mem efficient backend flag (#87946)
# Summary
Add a torch.backends.cuda flag and update the context manager to pick between the three implementations of scaled_dot_product_attention.
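A hedged sketch of exercising the flags (assuming the `torch.backends.cuda.sdp_kernel` context manager and the public `scaled_dot_product_attention` entry point; the exact spelling at the time of this commit may differ):
```python
import torch
import torch.nn.functional as F

q = k = v = torch.randn(2, 8, 128, 64, device="cuda", dtype=torch.float16)

# Force the memory-efficient kernel by disabling the flash and math fallbacks.
with torch.backends.cuda.sdp_kernel(enable_flash=False,
                                    enable_math=False,
                                    enable_mem_efficient=True):
    out = F.scaled_dot_product_attention(q, k, v)
```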

cc @cpuhrsch @jbschlosser @bhosmer @mikaylagawarecki
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87946
Approved by: https://github.com/cpuhrsch
2022-10-28 15:51:10 +00:00
soulitzer
adb76ef510 Expose API for backward execution order (#87507)
In this PR:
- graph_task stores graph roots on construction so that we can later traverse through the graph
- before the nodes are returned, they need to be converted from raw_ptr to shared_ptr; this should be OK because the graph is guaranteed to be alive

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87507
Approved by: https://github.com/albanD
2022-10-26 21:28:45 +00:00
Brian Hirsh
ce0c6e828e Reland "add an API for external backends to register custom device names (#86992)" (#87453)
Re-land of https://github.com/pytorch/pytorch/pull/86992

This reverts commit a895af9250.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87453
Approved by: https://github.com/ezyang, https://github.com/albanD
2022-10-21 16:51:36 +00:00
PyTorch MergeBot
a895af9250 Revert "add an API for external backends to register custom device names (#86992)"
This reverts commit fb6826bfd8.

Reverted https://github.com/pytorch/pytorch/pull/86992 on behalf of https://github.com/jeanschmidt due to breaking internal builds - D40534212 - arstudio-windows-tests-landcastle-0
2022-10-20 14:51:08 +00:00
Brian Hirsh
fb6826bfd8 add an API for external backends to register custom device names (#86992)
This API adds some improvements for external backends that are building C++ backends out of tree using the `PrivateUse1` dispatch key.

The docs and linked examples go over the API in more detail, but you should be able to use it like:
```
# This should probably be in the __init__.py file of an external backend's python package
> torch.register_privateuse1_backend("foo")
# And it will allow the user to do this:
> a = torch.ones(2, device="foo")
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86992
Approved by: https://github.com/albanD
2022-10-19 16:44:17 +00:00
Kurt Mohler
1dbc8ad3b7 Add Warning class and refactor C++ warnings to use it (#84101)
Also adds `TORCH_WARN_WITH` and `TORCH_WARN_DEPRECATION` macros

Part of #72948

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84101
Approved by: https://github.com/albanD
2022-10-18 20:02:42 +00:00
Jason Ansel
f1fdb6efbd Manual changes for moving dynamo to core (#86621)
This is the subset of the changes in #86461 not auto-generated by `copy_to_core.sh`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86621
Approved by: https://github.com/albanD
2022-10-11 23:01:21 +00:00
Elias Ellison
d3f7c34cb3 Enable aten-aten decomps (#85921)
Invokes aten-aten decomps with a re-entrant FakeMode. These decomps are being used in other places, so it's good to unify the path that static fake tensor propagation takes and get additional testing, etc. There is also an instance where we return different devices for cpu/cuda, which this fixes ([batch_norm](https://github.com/pytorch/pytorch/blob/master/torch/_decomp/decompositions.py#L1374)).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85921
Approved by: https://github.com/ezyang
2022-10-08 05:12:42 +00:00
PyTorch MergeBot
7ec12a559c Revert "Enable aten-aten decomps (#85921)"
This reverts commit 62e4f51efd.

Reverted https://github.com/pytorch/pytorch/pull/85921 on behalf of https://github.com/huydhn due to Sorry for reverting your PR. I think it breaks a dynamo test in trunk 62e4f51efd
2022-10-08 01:59:54 +00:00
soulitzer
ba3fde6aa0 Add multi-grad hooks (#86260)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86260
Approved by: https://github.com/albanD
2022-10-07 21:16:45 +00:00
Elias Ellison
62e4f51efd Enable aten-aten decomps (#85921)
Invokes aten-aten decomps with a re-entrant FakeMode. These decomps are being used in other places, so it's good to unify the path that static fake tensor propagation takes and get additional testing, etc. There is also an instance where we return different devices for cpu/cuda, which this fixes ([batch_norm](https://github.com/pytorch/pytorch/blob/master/torch/_decomp/decompositions.py#L1374)).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85921
Approved by: https://github.com/ezyang
2022-10-07 21:04:39 +00:00
Sahan Paliskara
936e93058b Delete torch::deploy from pytorch core (#85953)
As we have migrated torch::deploy over to https://github.com/pytorch/multipy, we can now delete it from pytorch core as ongoing development will happen there.

This PR was created due to syncing issues with https://github.com/pytorch/pytorch/pull/85443 which is where the review history can be found.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85953
Approved by: https://github.com/seemethere, https://github.com/malfet
2022-10-06 07:20:16 +00:00
Elias Ellison
9da5646cdb Add device logic handling for functions which allow scalar inputs as tensors (#86149)
Some functions allow scalars as tensor inputs. Add handling for them in device logic.

Fix for https://github.com/pytorch/torchdynamo/issues/1445
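For reference, a small sketch of the eager-mode rule that the device logic needs to reproduce (requires a CUDA build to run):
```python
import torch

# Eager PyTorch lets a 0-dim CPU tensor ride along with a CUDA tensor; the fake
# tensor device logic should pick the CUDA device instead of raising a mismatch.
x = torch.randn(4, device="cuda")
y = torch.add(x, torch.tensor(2.0))  # CPU scalar tensor as the second input
assert y.device.type == "cuda"
```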
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86149
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2022-10-04 18:54:00 +00:00
Driss Guessous
cd6477617c Custom sdp implementations dense (#85984)
# Summary

- This code creates the runtime dispatch system for choosing a performant fused SDP kernel. The only choice of fused kernel is flash_attention. It also creates Python flags and a context manager that can be used to turn the dispatch behavior on and off.
- This also adds support for flash_attention with dense tensors.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85984
Approved by: https://github.com/cpuhrsch
2022-10-03 17:36:37 +00:00
Elias Ellison
f183a989a2 Fix fake tensor kernel nesting (#85920)
If you, for example, printed within a decomp that calls `in_kernel_invocation_manager`, then on exit the manager would unconditionally remove meta from the TLS and set the tensor to report its real device. Instead, we should restore whatever the previous state was.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85920
Approved by: https://github.com/ezyang, https://github.com/bdhirsh, https://github.com/huydhn
2022-10-02 04:19:40 +00:00
PyTorch MergeBot
b562987c28 Revert "Fix fake tensor kernel nesting (#85920)"
This reverts commit c2d9ea7f4b.

Reverted https://github.com/pytorch/pytorch/pull/85920 on behalf of https://github.com/huydhn due to Sorry for reverting your PR but I suspect that it causes a flaky memory leak issue in TestFakeTensorCUDA.test_fake_crossref_backward_amp_linalg_lstsq_cuda_float32
2022-10-01 19:30:21 +00:00
Elias Ellison
c2d9ea7f4b Fix fake tensor kernel nesting (#85920)
If you, for example, printed within a decomp that calls `in_kernel_invocation_manager`, then on exit the manager would unconditionally remove meta from the TLS and set the tensor to report its real device. Instead, we should restore whatever the previous state was.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85920
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2022-09-30 23:11:20 +00:00
soulitzer
a876432aea Expose torch._will_engine_execute_node (#84773)
Addresses: https://github.com/pytorch/pytorch/issues/83617

This PR adds a way to query the TLS graph task's exec_info, which is a map from Node to a bool indicating whether it will be executed in the current backward pass (as determined by the inputs= argument of .grad or .backward).
- this works with both custom Function nodes and normal codegened nodes
- to be able to verify whether the pyobject passed is an actual Node, we now store pointers to the PyTypeObjects into a set on registration
- error out when .backward is called without inputs= to avoid silently returning True

Alternatives:
- not sure if it is possible to bind to Python from a raw pointer to Node. At least we wouldn't be able to use existing logic, and the Python object should only hold a weak reference to the Node.
- other solutions to the motivating issue seem to require more extensive modification to the engine

See the issue linked for an example of usage
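In addition, a hedged sketch of querying the binding from a backward hook (the `torch._C._will_engine_execute_node` name and the use of `Node.register_hook` are assumptions based on this description):
```python
import torch

a = torch.rand(2, requires_grad=True)
b = torch.rand(2, requires_grad=True)
out = (a * b).sum()
mul_node = out.grad_fn.next_functions[0][0]  # MulBackward0

def hook(grad_inputs, grad_outputs):
    # Ask the engine, from inside the backward pass, whether mul_node will run.
    print(torch._C._will_engine_execute_node(mul_node))

out.grad_fn.register_hook(hook)
out.backward(inputs=(a,))  # only the subgraph needed for `a` executes
```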
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84773
Approved by: https://github.com/albanD
2022-09-28 20:13:52 +00:00
Sherlock Huang
01dbbeeeb5 Expose cpp_backtrace to python binding (#84896)
We can now get a C++ stack trace by calling `torch.utils.get_cpp_backtrace()`.

Sample output when calling from a torch_dispatch stack:
```
<omitting python frames>
frame #23: torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<pybind11::handle>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName) (0x7f69330bab90 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/utils/python_arg_parser.cpp:323)
frame #24: <unknown function> (0x7f6932a09e79 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/python_variable.cpp:2252)
frame #25: <unknown function> (0x7f69261aee33 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/PythonFallbackKernel.cpp:56)
frame #26: <unknown function> (0x7f69261afef9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:19)
frame #27: c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadced in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:41)
frame #28: <unknown function> (0x7f6926fae9b9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/boxing.h:227)
frame #29: at::Tensor c10::Dispatcher::redispatch<at::Tensor, at::Tensor const&>(c10::TypedOperatorHandle<at::Tensor (at::Tensor const&)> const&, c10::DispatchKeySet, at::Tensor const&) const (0x7f6926e821f5 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:106)
frame #30: at::_ops::alias::redispatch(c10::DispatchKeySet, at::Tensor const&) (0x7f6927142c31 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:438)
frame #31: <unknown function> (0x7f692ae4f8be in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/ADInplaceOrViewType_1.cpp:1361)
frame #32: <unknown function> (0x7f692ae4f9b1 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/ADInplaceOrViewType_1.cpp:1362)
frame #33: <unknown function> (0x7f692aef77e9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13)
frame #34: <unknown function> (0x7f6926fae7d8 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:50)
frame #35: at::Tensor c10::Dispatcher::redispatch<at::Tensor, at::Tensor const&>(c10::TypedOperatorHandle<at::Tensor (at::Tensor const&)> const&, c10::DispatchKeySet, at::Tensor const&) const (0x7f6926e821c9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:97)
frame #36: at::_ops::alias::redispatch(c10::DispatchKeySet, at::Tensor const&) (0x7f6927142c31 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:438)
frame #37: <unknown function> (0x7f6929ec654a in /fsx/users/bahuang/repos/pytorch_fsx/build/aten/src/ATen/RedispatchFunctions.h:10697)
frame #38: <unknown function> (0x7f6929d9edae in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/VariableType_1.cpp:2837)
frame #39: <unknown function> (0x7f6929d9f043 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/VariableType_1.cpp:2838)
frame #40: <unknown function> (0x7f6929e7d2f9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13)
frame #41: <unknown function> (0x7f6929eb1344 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:478)
frame #42: <unknown function> (0x7f6929ea7b99 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:490)
frame #43: <unknown function> (0x7f6929e7d370 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:563)
frame #44: <unknown function> (0x7f6929e7d43a in /fsx/users/bahuang/repos/pytorch_fsx/c10/util/C++17.h:239)
frame #45: <unknown function> (0x7f6929e7d48c in /fsx/users/bahuang/repos/pytorch_fsx/c10/util/C++17.h:364)
frame #46: <unknown function> (0x7f6929e7d50a in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:554)
frame #47: c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadced in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:41)
frame #48: c10::KernelFunction::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadd26 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:43)
frame #49: c10::Dispatcher::redispatchBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f692603890a in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:652)
frame #50: <unknown function> (0x7f69260387f9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:388)
frame #51: <unknown function> (0x7f69261af0ef in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/PythonFallbackKernel.cpp:96)
frame #52: <unknown function> (0x7f69261aff2b in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:25)
frame #53: c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadced in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:41)
frame #54: c10::KernelFunction::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadd26 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:43)
frame #55: c10::Dispatcher::callBoxed(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6925fd6ab2 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:628)
frame #56: <unknown function> (0x7f6925fd6690 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:376)
frame #57: <unknown function> (0x7f692bf5b525 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:380)
frame #58: <unknown function> (0x7f692bf59fac in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/runtime/register_c10_ops.cpp:15)
frame #59: <unknown function> (0x7f692bf5af41 in /usr/include/c++/7/bits/std_function.h:316)
frame #60: std::function<void (std::vector<c10::IValue, std::allocator<c10::IValue> >&)>::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >&) const (0x7f6932ab9a0f in /usr/include/c++/7/bits/std_function.h:706)
frame #61: <unknown function> (0x7f6932aad541 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/stack.h:41)
frame #62: <unknown function> (0x7f6932ab3102 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/python/pybind_utils.h:1206 (discriminator 1))
frame #63: <unknown function> (0x7f6932ab3943 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/python/pybind_utils.h:1272)
frame #64: <unknown function> (0x7f6932a46120 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/python/init.cpp:1767)
frame #65: <unknown function> (0x7f6932a997be in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/cast.h:1441)
frame #66: <unknown function> (0x7f6932a8a985 in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/cast.h:1410)
frame #67: <unknown function> (0x7f6932a66e1e in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/pybind11.h:249)
frame #68: <unknown function> (0x7f6932a66ec2 in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/pybind11.h:224)
frame #69: <unknown function> (0x7f6932473111 in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/pybind11.h:929)
frame #104: __libc_start_main (0x7f693485dc87 in /build/glibc-uZu3wS/glibc-2.27/csu/../csu/libc-start.c:310)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84896
Approved by: https://github.com/ezyang
2022-09-27 14:59:08 +00:00
Elias Ellison
bcc544e9d7 Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417
Approved by: https://github.com/ezyang
2022-09-26 17:08:14 +00:00
PyTorch MergeBot
d10de31cc8 Revert "Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)"
This reverts commit 78afa0cf0c.

Reverted https://github.com/pytorch/pytorch/pull/85417 on behalf of https://github.com/clee2000 due to broke tests on trunk 78afa0cf0c
2022-09-23 17:21:43 +00:00
Elias Ellison
78afa0cf0c Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417
Approved by: https://github.com/ezyang
2022-09-23 15:50:03 +00:00
Richard Zou
5e5c319549 Move functorch python bindings to torch/csrc (#85426)
This moves functorch's python bindings to torch/csrc/functorch/init.cpp.
Coming next is the torchdim move. I didn't do torchdim yet because
moving functorch's python bindings unblocks some other things that I
want to do first.

Test Plan:
- tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85426
Approved by: https://github.com/ezyang
2022-09-22 18:47:12 +00:00
PyTorch MergeBot
5043457a8e Revert "Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)"
This reverts commit 9c77083965.

Reverted https://github.com/pytorch/pytorch/pull/85417 on behalf of https://github.com/clee2000 due to broke tests on trunk (and pull somehow) 9c77083965
2022-09-22 15:44:38 +00:00
Elias Ellison
9c77083965 Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417
Approved by: https://github.com/ezyang
2022-09-22 13:03:57 +00:00