Fixes https://github.com/pytorch/pytorch/issues/128544
Fixes https://github.com/pytorch/pytorch/issues/128535
We had a problem with multithreading where some nonlocals were being
clobbered. We stored these nonlocals in the first place because we
wanted to ferry information from an autograd.Function.apply to
the corresponding autograd.Function.forward.
Our new approach is:
- pass the information directly as an input to the
autograd.Function.apply. This means that the autograd.Function.forward
will receive the information too.
- this messes up ctx.needs_input_grad, which has an element per input to
forward. The user should not see the additional information we passed.
We fix this by temporarily overriding ctx.needs_input_grad to the
right thing.
- this exposed a bug in that ctx.needs_input_grad wasn't correct for
TensorList inputs. This PR fixes that too.
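For illustration, here is the idea in plain eager terms (a hand-written sketch, not Dynamo's actual tracing code): the extra information is passed as one more argument to apply, so forward sees it too, and ctx.needs_input_grad picks up an extra entry that we have to hide from the user.
```python
import torch

class MyFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, ferried_info):
        # forward receives the ferried info because it was passed to apply.
        # Note: ctx.needs_input_grad now has two entries (for x and for
        # ferried_info); this PR temporarily overrides it so the user only
        # sees the entry for x.
        return x.clone()

    @staticmethod
    def backward(ctx, grad):
        return grad, None  # no gradient for the ferried info

out = MyFn.apply(torch.randn(3, requires_grad=True), {"ferried": "info"})
```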
Test Plan:
- existing and new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128547
Approved by: https://github.com/williamwen42, https://github.com/soulitzer
If a user accesses an OpOverloadPacket, then creates a new OpOverload,
then uses the OpOverloadPacket, the new OpOverload never gets hit. This
is because OpOverloadPacket caches OpOverloads when it is constructed.
This PR fixes the problem by "refreshing" the OpOverloadPacket if a new
OpOverload gets constructed and the OpOverloadPacket exists.
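As an illustration (a minimal sketch using a hypothetical "mylib" namespace), the scenario looks like this:
```python
import torch

lib = torch.library.Library("mylib", "DEF")
lib.define("foo(Tensor x) -> Tensor")
packet = torch.ops.mylib.foo  # OpOverloadPacket constructed; existing overloads cached

lib.define("foo.vec(Tensor[] xs) -> Tensor[]")  # a new OpOverload is created afterwards
# With this PR, the packet gets refreshed when the new overload is registered,
# so the already-obtained `packet` can see `foo.vec` too.
```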
Test Plan:
- new tests
This is the third land attempt. The first one was reverted for breaking
internal tests, the second was reverted for being erroneously suspected
of causing a perf regression.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128000
Approved by: https://github.com/albanD
This PR excises opcheck's dependency on
torch.testing._internal.common_utils, (which comes with dependencies on
expecttest and hypothesis). We do this by moving what we need to
torch.testing._utils and adding a test for it.
Fixes #126870, #126871
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127292
Approved by: https://github.com/williamwen42
ghstack dependencies: #127291
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo makes `usort` do more and generates the changes in this PR. Except for `pyproject.toml`, all changes were generated by `lintrunner -a --take UFMT --all-files`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126
Approved by: https://github.com/kit1980
ghstack dependencies: #127122, #127123, #127124, #127125
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo makes `usort` do more and generates the changes in this PR. Except for `pyproject.toml`, all changes were generated by `lintrunner -a --take UFMT --all-files`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127125
Approved by: https://github.com/Skylion007
ghstack dependencies: #127122, #127123, #127124
Summary:
co-dev reland of https://github.com/pytorch/pytorch/pull/124520, which requires
the removal of some executorch tests.
Before this PR, we didn't check that types in a schema were valid. This
is because TorchScript treats unknown types as type variables.
This PR checks types in a schema for the TORCH_LIBRARY APIs. To do this,
we add an `allow_typevars` flag to parseSchema so that TorchScript can
use allow_typevars=True. We also add some error messages for common
mistakes (e.g. using int64_t or double in schema).
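For example (a sketch with a hypothetical namespace; the exact error text may differ), a schema using a C++ type name is now rejected instead of being treated as a type variable:
```python
import torch

try:
    torch.library.define("mylib::bad", "(Tensor x, int64_t n) -> Tensor")
except RuntimeError as e:
    # The error now points at the invalid type and suggests the schema
    # type ("int") instead of the C++ type ("int64_t").
    print(e)
```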
Test Plan: Wait for tests
Differential Revision: D57666659
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126861
Approved by: https://github.com/albanD
If a user accesses an OpOverloadPacket, then creates a new OpOverload,
then uses the OpOverloadPacket, the new OpOverload never gets hit. This
is because OpOverloadPacket caches OpOverloads when it is constructed.
This PR fixes the problem by "refreshing" the OpOverloadPacket if a new
OpOverload gets constructed and the OpOverloadPacket exists.
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126863
Approved by: https://github.com/albanD
More details further down, but first a higher-level description of how we functionalize storage resizing.
Today, dynamo converts `param.untyped_storage().resize_(x)` calls that it sees from fsdp into a custom op, `ops.inductor.resize_storage_bytes_(x)`
So given this setup, there are 3 main cases that I think we want to handle:
(1) graph input starts with a real storage size, gets resized down to zero in the graph
(2) graph input starts with 0 storage size, gets resized up in the graph
(3) graph input starts with 0 storage size, gets resized up and used in some compute, then resized back down to 0
For case (1) we need to emit a `resize_storage_bytes_` at the end of the graph, similar to how we emit `copy_()` for data mutations.
For case (2), we need to emit a `resize_storage_bytes_` in the graph, and we **also** need to emit a `copy_()` (the input had its storage resized up and filled in with data, which we need to reflect as an input mutation)
For case (3), the net effect is that the input had no data on entry and exit of the function, so we don't need to emit any mutable ops in the end of the graph.
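In eager terms, the three patterns look roughly like this (a sketch; the parameter/buffer names are made up, and param and buf are assumed to have matching sizes and dtype):
```python
import torch

# Case (1): input starts with real storage and is resized down to zero.
def case1(param, buf):
    out = param + buf
    param.untyped_storage().resize_(0)
    return out

# Case (2): input starts with zero-size storage, is resized up and filled in.
def case2(param, buf):
    param.untyped_storage().resize_(buf.untyped_storage().nbytes())
    param.copy_(buf)
    return param * 2

# Case (3): resized up, used in some compute, then resized back down to zero.
def case3(param, buf):
    param.untyped_storage().resize_(buf.untyped_storage().nbytes())
    param.copy_(buf)
    out = param @ param
    param.untyped_storage().resize_(0)
    return out
```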
The main thing to call out is that we need to write a functionalization rule for `resize_storage_bytes_` (`FunctionalTensorWrapper::storage_resize_()`), and this rule actually does very little. We would like to **not** emit any new ops in the graph (like, say, a functional resize op). Instead, we should expect / rely on the fact that any resize up will be immediately followed by a `copy_()`/`foreach_copy_`/`out=` op that will fill in the data of the tensor. So `FunctionalTensor` can temporarily live in a state where its data is invalid, until the `x.copy_(y)` "updates" its data with the new tensor.
So effectively, all that this rule does is:
(1) it stores metadata on the storage, indicating that the tensor was resized, as well as the updated storage size. We need this info in AOTAutograd, so it knows whether to emit a mutable resize_() op in the graph epilogue
(2) There is also a corner case: if we are resizing down to zero, but our tensor had **previously** had a zero size storage, then we update `value_` to point to the original value of the tensor. The reason this seems safe is because if we have a zero storage sized tensor `x`, and we resize it up, use it in some compute, resize it back down to zero, and use it somewhere, we would want the functional version of this code to use the original `x` after the second resize. For FSDP, this is important because we end up saving parameters (graph inputs) for backward, and we want to make sure that the thing we save (and the output to the forward graph) is the original, zero-storage-sized parameter, and not the "version 2" of the parameter after the first resize_()
I think a good order to look at changes in this PR would be:
(1) `test_aotdispatch.py` shows the 3 main cases I focused on as well as the expected functionalized graphs
(2) In `FunctionalStorageImpl.h/cpp`, I had to add a notion of "original base", and "original/curr_size". The first is so I can re-use the zero-size tensor after multiple resizes, and the second is so I can tell in AOTAutograd whether any resizes canceled each other out into a no-op
(3) FunctionalTensorWrapper.h/cpp has the new resize functionalization rule + some extra utils
(4) `_functorch/_autograd`: the main changes in this folder were around adding the logic at trace-time to detect when we need to put a resize_() in the graph. I also have some assertions to check that any inputs that experience storage resizing will **always be in the graph** and not the opaque epilogue, and I also limited the resize_() mutation case so that you can only ever start with zero storage, or end with zero storage (you can't do e.g. `torch.ones(2).storage().resize_(3)`), and banned it on tensor subclasses
(5) `fake_tensor.py`/`meta_utils.py`: we now need to be able to fakeify tensors with zero storage, so I added a quick version of it in meta_utils.py. This also has ramifications for fake tensor caching that I need to fix (include the storage size in the cache key, maybe?)
------------------
This PR subsumes https://github.com/pytorch/pytorch/pull/120971.
This PR is enough to **almost** get a simple ppFSDP forward pass tracing with a functionalized resize_() properly. It also attempts to do the updated version from @jansel, where we don't have any notion of `resize_()` in the graph at all, post functionalization. It would probably be good to test it with @yf225 's FSDP changes, and see how many of the FX passes it allows us to remove. I think that in theory, it should allow us to remove all FX passes that affect the forward graph / partitioner, **except** the one that forces views to be recomputed in the backward (more details below).
There are a few things worth calling out:
(1) failed attempt at functionalizing `aten.copy_()`. I originally wanted to get a version that takes these operations:
```
param.storage().resize_(all_gather_size)
param.copy_(all_gather_buffer)
out = aten.matmul(param, param)
```
and functionalizes them into:
```
out = aten.matmul(all_gather_buffer, all_gather_buffer)
```
This would involve getting functionalization to turn `x.copy_(y)` into a giant no-op that just returns `y`. Unfortunately, we can't actually do this in a reasonable way within functionalization (instead, there's a functional `aten.copy` in the graph - see the test case graph expecttest for details). Why? In order for that transformation to be safe, `x` and `y` need to have the same metadata. However, it's possible for `x` and `y` to be subclasses of different types. This is not something we can easily tell from within functionalization, and would be a layering violation. So for now I'm leaving it to downstream code to optimize away the `aten.copy` (this is already the case today, so I think inductor can handle this)
(2) The forward doesn't **actually** run successfully in this PR (see the `assertRaisesRegex` in the test). Why?
The final forward graph looks like this:
```
def forward(self, primals_1, primals_2):
_foreach_copy = torch.ops.aten._foreach_copy.default([primals_1], [primals_2]); primals_2 = None
getitem = _foreach_copy[0]; _foreach_copy = None
mm = torch.ops.aten.mm.default(getitem, getitem); getitem = None
t_1 = torch.ops.aten.t.default(primals_1); primals_1 = None
return [mm, t_1]
```
Where `primals_1` starts out as a secretly-zero-storage-size parameter, and gets resized up and back down within the forward (these are functionalized away).
Importantly, the matmul happens on the result of the `foreach_copy`, **but** the activation that we save for backward (`t_1`) is the result of transposing the **original parameter** (the zero-storage-size param). This is exactly the optimization in fsdp that allows us to have good peak memory usage.
The problem is that the min-cut partitioner decides to save `t_1` for backward. Running this code in eager breaks, because the kernel for `aten.permute(x)` is not happy when `x` has secretly-zero-sized-storage.
The real problem here is that in eager mode the `permute` kernel runs during the backward, after backward hooks have properly resized the saved activation. Here, we are running the transpose in the forward.
One option would be to turn off the checks in our view kernels and allow them to work on zero-storage-sized tensors, which feels pretty bad. Another option is to tweak the partitioner (or use one of Will's FX passes) to force the partitioner to not save views for backward, and allow the views to be recomputed in the backward. This seems kind of silly, but is also probably harmless.
(3) The backward is still broken. To be fair, this issue is pretty separable from "functionalizing storage resize calls", and can be fixed later (either by a real fix to our tracing infra, or via another hacky FX pass). More description of this problem is described at issue (8) of my PR description in https://github.com/pytorch/pytorch/pull/120971
(4) I only added support for "full graph" resizing: basically, the limited case where a param starts with zero storage size, and gets resized up and back down. I think we can add support for the graph break case, but I think we can keep that add-on separate from this PR unless we need it immediately. I also added asserts so we should fail loudly when we hit this case
(5) I have a change to FakeTensor creation when inputs have zero storage size that is probably OK. But I also removed FakeTensor caching on view ops, which I probably need to fix before I can land this PR
(6) I added a notion of "original_base" to `FunctionalStorageImpl`. More details are in the comments, but my rationale for this was that we basically need it to ensure that autograd saves the **original**, zero-storage-sized param for backward, after resizing up and back down
(7) I had to update our eager kernels for `aten.copy` and `aten._foreach_copy`, to handle the case where the `self` argument has secretly-zero-storage. Inductor can probably generate correct code for this case, but we need these ops to work properly in this situation for the `aot_eager` backend to do the right thing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122434
Approved by: https://github.com/jansel
torch.library.register_fake reports the python module the fake impl is
located in. This is used to check against
`m.set_python_module("foo.bar")` calls in C++.
The module reporting logic was wrong in most cases. This PR fixes it.
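For reference, a minimal sketch of the reporting side (hypothetical op; in the real use case the op would be defined in C++, and the C++ registration would declare m.set_python_module with the module that contains this call):
```python
import torch

torch.library.define("mylib::mul2", "(Tensor x) -> Tensor")

# The module in which this call appears is what register_fake reports and
# what gets checked against any m.set_python_module(...) declaration.
@torch.library.register_fake("mylib::mul2")
def _(x):
    return torch.empty_like(x)
```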
Test Plan:
- exhaustive tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125037
Approved by: https://github.com/williamwen42
This is to mirror autograd.Function's setup_context behavior.
The PyTorch Dispatcher removes default values for "FC/BC reasons", but I
convinced myself there's no FC/BC problem for the setup_context API.
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124852
Approved by: https://github.com/albanD
ghstack dependencies: #124637, #124805, #124806
This is a partial revert of https://github.com/pytorch/pytorch/pull/124059
Like in #124297, profiling has revealed that testing equality on *every* output is kind of expensive. So we only test equality when we know there is an unbacked binding. This is the same playbook as the previous PR, just on FakeTensorProp instead of PropagateUnbackedSymInts. Note that we also need to populate `unbacked_bindings` in proxy_tensor.py, since we're generating an entirely new graph in that case.
We now have enough propagation that we're able to trigger a bug related to divisibility replacement. In https://github.com/pytorch/pytorch/pull/113165 we allowed replacing `u0` with `u1 * c` for some constant c, when we have determined that u0 is divisible by c. However, where does the binding for u1 come from? What we will have in practice is that there is some node that is supposed to have bound u1, but which actually is getting a `u1 * c` in its output. So, to get u1, we must divide out c. Fortunately, under the divisibility condition, this is always possible (but remember, we must test divisibility at runtime!)
Because we have tightened up asserts, it is now an error to allocate unbacked SymInts and then fail to track them under unbacked_bindings. In torch/_dynamo/eval_frame.py and torch/_functorch/_aot_autograd/collect_metadata_analysis.py there are examples of benign cases where we repropagated fake tensors but then immediately threw away the results. In these cases, it's not appropriate to rebind, since we're still using the old FX graph that has all of the old symbols. So we just manually clear it. It is possible that other cases will need to be updated, so this PR is "risky" from the perspective of hitting fbcode.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124310
Approved by: https://github.com/lezcano
Before this PR, we didn't check that types in a schema were valid. This
is because TorchScript treats unknown types as type variables.
This PR checks types in a schema for the TORCH_LIBRARY APIs. To do this,
we add an `allow_typevars` flag to parseSchema so that TorchScript can
use allow_typevars=True. We also add some error messages for common
mistakes (e.g. using int64_t or double in schema).
Test Plan:
- new tests
Differential Revision: [D56432690](https://our.internmc.facebook.com/intern/diff/D56432690)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124520
Approved by: https://github.com/albanD
The user does not need to return gradients for keyword-only args.
We also change how setup_context works to adapt to kwonly-args. If
the user's op has no kwonly-args, then their setup_context function must
look like `setup_context(ctx, inputs, output)`: we require that the
arguments have the same names.
If the user's op has kwonly-args, then their setup_context function must
look like `setup_context(ctx, inputs, keyword_only_inputs, output)`.
We require that the arguments have the same names.
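For example (a sketch using a hypothetical op and the current torch.library names, which may differ slightly in detail from what this stack originally shipped):
```python
import torch

@torch.library.custom_op("mylib::scale", mutates_args=())
def scale(x: torch.Tensor, *, factor: float) -> torch.Tensor:
    return x * factor

# The op has a kwarg-only arg, so setup_context takes keyword_only_inputs.
def setup_context(ctx, inputs, keyword_only_inputs, output):
    ctx.factor = keyword_only_inputs["factor"]

# backward only returns a gradient for x, not for the kwarg-only factor.
def backward(ctx, grad):
    return grad * ctx.factor

scale.register_autograd(backward, setup_context=setup_context)

scale(torch.randn(3, requires_grad=True), factor=2.0).sum().backward()
```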
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124806
Approved by: https://github.com/albanD, https://github.com/williamwen42
ghstack dependencies: #124637, #124805
If a user accesses an OpOverloadPacket, then creates a new OpOverload,
then uses the OpOverloadPacket, the new OpOverload never gets hit. This
is because OpOverloadPacket caches OpOverloads when it is constructed.
This PR fixes the problem by "refreshing" the OpOverloadPacket if a new
OpOverload gets constructed and the OpOverloadPacket exists.
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124654
Approved by: https://github.com/albanD
This PR:
- exposes torch.testing._internal.optests.opcheck as
torch.library.opcheck
- Adds support for CustomOpDef (aka functions decorated with
torch.library.custom_op) to opcheck.
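Usage now looks roughly like this (sketch, hypothetical op):
```python
import torch

@torch.library.custom_op("mylib::add_one", mutates_args=())
def add_one(x: torch.Tensor) -> torch.Tensor:
    return x + 1

@add_one.register_fake
def _(x):
    return torch.empty_like(x)

# opcheck accepts the CustomOpDef directly (as well as an OpOverload).
torch.library.opcheck(add_one, (torch.randn(3),))
```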
Test Plan:
- Updated tests
- We validated opcheck's design internally.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124496
Approved by: https://github.com/williamwen42
Motivations:
- this is pretty redundant with test_aot_dispatch_dynamic.
- The user story for opcheck is that a user should use opcheck to see
if their operator was "registered correctly". If a user's custom op
only supports dynamic shapes, then it's a bit awkward for
one of the tests (e.g. `test_aot_dispatch_static`) to fail.
- We've already stopped running test_aot_dispatch_static in all of
our opcheck tests.
Test Plan:
- wait for CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124495
Approved by: https://github.com/williamwen42
ghstack dependencies: #124180, #124200, #124299, #124134, #124199, #124403, #124414
old: `register_autograd(setup_context, backward, /)`
new: `register_autograd(backward, /, *, setup_context=None)`
Motivations:
- We introduce these APIs as "give us a backward and use setup_context
to save things for backward".
- setup_context isn't always necessary.
Test Plan:
- tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124403
Approved by: https://github.com/albanD
ghstack dependencies: #124180, #124200, #124299, #124134, #124199
Motivation:
- The API is used for registering an implementation for a specific
device type.
- "impl" is ambiguous and can be confused with Library.impl.
Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124200
Approved by: https://github.com/albanD
ghstack dependencies: #124180
We allow it to accept:
- a string with the op name
- an opoverload
- a new-style custom op
If any of these are referring to a new-style custom op (created with the
custom_op decorator), then we dispatch to CustomOpDef.register_fake.
Otherwise, we do what we previously did.
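In other words, all of the following forms are accepted (sketch, hypothetical op):
```python
import torch

@torch.library.custom_op("mylib::double_it", mutates_args=())
def double_it(x: torch.Tensor) -> torch.Tensor:
    return x * 2

def fake_double_it(x):
    return torch.empty_like(x)

# Any one of these works:
torch.library.register_fake("mylib::double_it", fake_double_it)                   # op-name string
# torch.library.register_fake(torch.ops.mylib.double_it.default, fake_double_it)  # an OpOverload
# double_it.register_fake(fake_double_it)                                         # new-style custom op
```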
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124066
Approved by: https://github.com/albanD
ghstack dependencies: #123937, #124064, #124065
This PR:
- adds a new torch.library.register_fake and deprecates
torch.library.impl_abstract. The motivation is that we have a lot of
confusion around the naming so we are going to align the naming with
the actual subsystem (FakeTensor).
- renames `m.impl_abstract_pystub("fbgemm_gpu.sparse_ops")` to
`m.has_python_registration("fbgemm_gpu.sparse_ops")`. No deprecation
here yet; I need to test how this works with static initialization.
- Renames a bunch of internals to match (e.g. abstractimplpystub ->
pystub)
I'm scared to rename the Python-side internal APIs (e.g.
torch._library.abstract_impl) because of torch.package concerns. I'll do
that in its own isolated PR next just in case it causes problems.
DEPRECATION NOTE: torch.library.impl_abstract was renamed to
torch.library.register_fake. Please use register_fake. We'll delete
impl_abstract in a future version of PyTorch.
Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123937
Approved by: https://github.com/albanD
If a user accesses an OpOverloadPacket, then creates a new OpOverload,
then uses the OpOverloadPacket, the new OpOverload never gets hit. This
is because OpOverloadPacket caches OpOverloads when it is constructed.
This PR fixes the problem by "refreshing" the OpOverloadPacket if a new
OpOverload gets constructed and the OpOverloadPacket exists.
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123578
Approved by: https://github.com/albanD
ghstack dependencies: #123453
The user provides a `setup_context` and a `backward_function`. These
get put into a torch.autograd.Function that gets registered as the
custom op's autograd implementation.
Test Plan:
- we update custom ops in the custom_op_db to use the new
register_autograd API.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123110
Approved by: https://github.com/albanD
ghstack dependencies: #123108, #123109
Previously it worked with torchgen.model.FunctionSchema. This PR extends
it to work with torch._C._FunctionSchema by making
torchgen.model.FunctionSchema look more like torch._C._FunctionSchema.
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123108
Approved by: https://github.com/albanD
Previously, it suggested that a user add a manual functionalization
kernel. However, since we have auto_functionalize now, the user's first
course of action should be to modify their op into the form that
auto_functionalize accepts (this is possible in the majority of custom
ops).
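Concretely, the shape being suggested looks something like this (a sketch with a hypothetical op): mutations expressed through Tensor(a!) annotations, with the op returning nothing, which is a form auto_functionalize can handle.
```python
import torch

torch.library.define("mylib::inplace_scale", "(Tensor(a!) x, float scale) -> ()")

@torch.library.impl("mylib::inplace_scale", "cpu")
def inplace_scale_cpu(x, scale):
    # Mutation happens on the annotated argument; nothing is returned.
    x.mul_(scale)
```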
Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123261
Approved by: https://github.com/williamwen42
This is the entrypoint for defining an opaque/blackbox (i.e. PyTorch will
never peek into it) custom op. In this PR, you can specify backend impls
and the abstract impl for this op.
NB: most of this PR is docstrings, please don't be intimidated by the
line count.
There are a number of interesting features:
- we infer the schema from type hints. In a followup I add the ability
to manually specify a schema.
- name inference. The user needs to manually specify an op name for now.
In a followup we add the ability to automatically infer a name (this
is a little tricky).
- custom_op registrations can override each other. This makes them
more pleasant to work with in environments like colab.
- we require that the outputs of the custom_op do not alias any inputs
or each other. We enforce this via a runtime check, but can relax this
into an opcheck test if it really matters in the future.
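A minimal example of the entrypoint (a sketch using a hypothetical "mylib" namespace and the names this API ships under in torch.library today, which may differ slightly from what this PR originally exposed):
```python
import torch

@torch.library.custom_op("mylib::my_sin", mutates_args=())
def my_sin(x: torch.Tensor) -> torch.Tensor:   # schema inferred from type hints
    return torch.sin(x)

@my_sin.register_fake                          # the abstract/fake impl
def _(x):
    return torch.empty_like(x)

out = my_sin(torch.randn(3))
```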
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122344
Approved by: https://github.com/ezyang, https://github.com/albanD
- Adds support for custom ops backed by C++ custom autograd functions, e.g. fbgemm
- Include files more granularly to avoid namespace pollution and circular imports
Limitations:
- Requires the user to audit their code and opt in their custom autograd::Function via autograd::Function::is_traceable, and possibly add compiled_args + apply_with_saved implementations. This was the only way I could think of to ensure soundness.
- Will throw if we can't hash the saved_data, i.e. for any type other than list and dict not implemented in at::IValue::hash b0cfa96e82/aten/src/ATen/core/ivalue.cpp (L364)
- Can technically silently fail if the typeid hash and the typeid string name of the custom autograd::Function both collide at the same time, and an identical autograd graph containing a different custom autograd::Function with an identical implementation is called. This case seems extremely unlikely, and the only alternative to hashing I can think of is compiling with reflection.
- Tensors not saved via save_variables are not lifted, and are specialized on the TensorImpl*'s hash (treated as a memory address). If needed, we can lift them.
Differential Revision: [D54818488](https://our.internmc.facebook.com/intern/diff/D54818488)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120681
Approved by: https://github.com/jansel
TestCustomOp's tests use helper attributes and functions from a util parent class. To support arbitrary test classes, we need to refactor the current approach: instead of allowlisting certain methods, we copy the whole class and only overwrite the "test_.*" methods.
Compiled autograd fails on ~10/90 of the newly added tests. test_autograd_function_backed_op is the example we discussed in the PT-2D meeting about requiring C++ autograd::Function support. I'm addressing this in #120732.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120679
Approved by: https://github.com/jansel, https://github.com/zou3519
Summary:
We've made the following changes:
- The new way to use the API is `m.impl_abstract_pystub(module, context)`.
Every subsequent m.def of an op inside the TORCH_LIBRARY block gives
the op the `impl_abstract_pystub`.
- Added a mechanism to determine if an operator was defined in Python or C++.
Library.define in Python appends the op to a global set, which is analogous
to what we do for tracking Library.impl.
- If someone does `torch.library.impl_abstract` in Python for an operator, then
we require that it has an `impl_abstract_pystub` specified and we also check
that the module in the `impl_abstract_pystub` is the same as the module where
the call to `torch.library.impl_abstract` exists.
- Unfortunately we can't check the "context" (which is the buck target on
buck-based systems) because buck sits above us.
bypass-github-export-checks
Test Plan: - existing tests
Differential Revision: D51080493
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113182
Approved by: https://github.com/ezyang
Summary:
We've made the following changes:
- The new way to use the API is `m.impl_abstract_pystub(module, context)`.
Every subsequent m.def of an op inside the TORCH_LIBRARY block gives
the op the `impl_abstract_pystub`.
- Added a mechanism to determine if an operator was defined in Python or C++.
Library.define in Python appends the op to a global set, which is analogous
to what we do for tracking Library.impl.
- If someone does `torch.library.impl_abstract` in Python for an operator, then
we require that it has an `impl_abstract_pystub` specified and we also check
that the module in the `impl_abstract_pystub` is the same as the module where
the call to `torch.library.impl_abstract` exists.
- Unfortunately we can't check the "context" (which is the buck target on
buck-based systems) because buck sits above us.
Test Plan: - existing tests
Differential Revision: D50972148
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112851
Approved by: https://github.com/ezyang
Summary:
If there are xfails in the failures_dict and the operator has the
pt2_compliant_tag, then we raise an error. These generated tests are separate
from those in the failures dict because we don't actually need any sample
inputs to check this.
Test Plan: - New tests
Differential Revision: D50936201
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112759
Approved by: https://github.com/ezyang
Unlike the previous torch.library.define, the schema here doesn't include the
name (the name is part of the qualname). We separated out the qualname
from the schema in the new APIs so that they're all consistent with each
other (they all accept the qualname separately).
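For example (hypothetical op):
```python
import torch

# The qualname carries the name; the schema string no longer does.
torch.library.define("mylib::mul_add", "(Tensor x, Tensor y, float alpha) -> Tensor")
```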
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111915
Approved by: https://github.com/suo, https://github.com/ezyang
ghstack dependencies: #111912
torch.library.impl now accepts a device string (e.g. "cpu", "cuda"). It
still accepts DispatchKey strings, but we no longer document this, because
using arbitrary DispatchKeys is more for the power users.
We map the device string to a DispatchKey and then register the impl for
said DispatchKey. A user may also specify multiple device strings at once
or specify "types=default" to get a CompositeExplicitAutograd registration.
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111659
Approved by: https://github.com/soulitzer
ghstack dependencies: #111380
We add a new overload to torch.library.impl that accepts an optional
Library arg. If provided, the lifetime of the registration will be
tied to the Library arg, otherwise, it will live forever.
Test Plan:
- existing and new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111308
Approved by: https://github.com/soulitzer
ghstack dependencies: #111307
Summary:
Make it easier to add `generate_opcheck_tests` by adding defaults for
the failures_dict location, the additional decorators, and the test
utils.
Test Plan:
Existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110977
Approved by: https://github.com/williamwen42
ghstack dependencies: #110951
This PR adds the following helper functions for generated opcheck tests:
- dontGenerateOpCheckTests is a decorator that skips generation of the
opcheck tests for the decorated function
- is_inside_opcheck_mode lets us query if we are in a generated test.
Useful for fast debugging out-of-tree without needing to update
PyTorch.
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110951
Approved by: https://github.com/williamwen42
This PR allows us to use the same failures_dict for multiple test
classes. This is helpful if you have a bunch of small TestCase(es) and
to centralize all the failures dict into one big one.
Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110164
Approved by: https://github.com/williamwen42
Changelog:
- torch.library.impl_abstract optionally accepts a torch.library.Library
object. If passed in, then the lifetime of the registration is tied to
the Library object.
- we've also changed torch.library.impl_abstract to work on all
operators, including overloads.
- we refactored the `torch._custom_ops.*` and `torch._custom_op.*`
impl_abstract APIs and put them under torch._library. This is the
final resting place for them. I will follow-up with deleting
all the `torch._custom_ops.*` stuff later.
- There is a new "SimpleOperatorRegistry" where we actually collect the
abstract_impl. We will expand this to also hold the other
torch._custom_ops.* APIs when we move those to torch.library
NB: Previously we had designed
`impl_abstract` assuming a very high-level Python-only custom op API.
We've revisited that since; now, impl_abstract works for all custom ops,
no matter python or C++, no matter the schema. The new refactored design
reflects this better.
Test Plan:
- existing and new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109912
Approved by: https://github.com/ezyang
We want users to be able to define custom ops in C++ but put the
abstract impl in Python (since it is easier to write them in Python and
the abstract impl better models device semantics and data-dependent
operators).
`m.impl_abstract_pystub(opname, python_module, context)` declares the
abstract_impl of the operator to exist in the given python module.
When the abstract_impl needs to be accessed (either via FakeTensor or
Meta), and it does not exist, the PyTorch Dispatcher will yell
with a descriptive error message.
Some details:
- We construct a new global AbstractImplPyStub mapping in
Dispatcher.cpp. Read/write to this map is protected by the Dispatcher
lock.
- We add a new Meta Tensor fallback kernel. The fallback errors out if there is
no meta kernel, but also offers a nicer error message if we see that there is
a pystub.
- We create a `torch._utils_internal.throw_abstract_impl_not_imported_error`
helper function to throw errors. This way, we can throw different error
messages in OSS PyTorch vs internal PyTorch. To invoke this from C++, we
added a PyInterpreter::throw_abstract_impl_not_imported_error.
Differential Revision: [D49464753](https://our.internmc.facebook.com/intern/diff/D49464753/)
Differential Revision: [D49464753](https://our.internmc.facebook.com/intern/diff/D49464753)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109529
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
On failure of a test, we will always print a "repro". This repro isn't
really runnable but gives the user a sense of how to actually reproduce
the test without the test suite, because using the test suite is a bit
convoluted.
If the user passes PYTORCH_OPCHECK_PRINT_BETTER_REPRO, we will print a
fuller repro that saves the exact problematic test inputs to disk and
reads them back out.
Test Plan:
- expecttests on the generate_repro helper function
- tried this out locally.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109640
Approved by: https://github.com/bdhirsh, https://github.com/soulitzer
ghstack dependencies: #109637, #109638, #109639
**Motivation:**
We want torch.cond to use torch.compile automatically so that we can error out when there are side effects in the branches and correctly handle the closures.
Before this PR, the following code produces a warning unless the raise_on_backend_change config is turned on (in which case it produces an error):
```python
def foo():
    ...

# Inside torch.cond, we'd like to do something like:
torch.compile(foo, backend="eager", fullgraph=True)(...)

# Users may then call torch.compile somewhere else.
# Dynamo will use the cached code of foo for the "eager" backend,
# but we expect dynamo to recompile with the "inductor" backend.
torch.compile(foo, backend="inductor")(...)
```
This PR adds a BACKEND_MATCH guard. Effectively, it implements a per-backend cache. In the above example, the cached code for "eager" won't work for "inductor" due to guard check failures and the second torch.compile will do a re-compilation. In the future, it might be useful to have something like a configuration guard that guards against dynamo configuration changes across different compiles (e.g. compile a function with fullgraph=False then compile it again with fullgraph=True).
**Implementation:**
1. We add a guarded_backend_cache and check the most_recent_backend against the backend associated with the cached code. We also remove the raise_on_backend_change flag.
Note: more lines are printed in the debug log due to the newly added context manager and guard additions.
**Test Plan:**
Removed the original tests that raise on a different backend and added a new test that checks whether the BACKEND_MATCH guard can guard against backend changes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107337
Approved by: https://github.com/jansel
- Update cross-ref FakeMode test to use ShapeEnv. Dynamic ops can now
return an unbacked SymInt. We always accept this as equal to whatever
the real value was.
- Relax test so it works on all classes, not just unittest.TestCase
- Properly wrap the original method, so things like
pytree.mark.parametrize are carried over
- Support dynamic shapes by default for make_fx `tracing_mode="fake"` without symbolifying everything else
Fixes https://github.com/pytorch/pytorch/issues/108927
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108929
Approved by: https://github.com/zou3519
We changed the failures_dict format from .py to json and added a way to
automatically update the failures dict (the user can set
PYTORCH_OPCHECK_ACCEPT=1 to do so), assuming the tests don't crash in the
process.
Some details:
- We introduced a FailuresDict class that handles save/load and from which one
can query a test status ("xfail", "skip", etc).
- PYTORCH_OPCHECK_ACCEPT=1 does not override everything. In particular: it
doesn't try to update the failures dict for a test marked as "skip", but it
will update it for tests marked as "xfail" or "success".
- PYTORCH_OPCHECK_ACCEPT=1 also does not override the "comment" field, unless
it is flipping an "xfail" into "success".
- I'll update the gdoc linked in the comments with how to actually use
PYTORCH_OPCHECK_ACCEPT=1 internally (it's not trivial).
Note that this isn't multithreading-safe; the current recommendation is to run
the tests sequentially if you want to use PYTORCH_OPCHECK_ACCEPT=1.
Differential Revision: D49167181
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109110
Approved by: https://github.com/ezyang