If no keyword arguments are passed, `**kwargs` expands just fine, without the extra overhead of `or {}`. In addition to reducing boilerplate, this also comes with a small perf improvement:
```
In [1]: def null(*args, **kwargs):
...: pass
...:
In [2]: def call1(*args, **kwargs):
...: return null(*args, **(kwargs or {}))
...:
In [3]: def call2(*args, **kwargs):
...: return null(*args, **kwargs)
...:
In [4]: %timeit call1()
145 ns ± 2.07 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
In [5]: %timeit call2()
118 ns ± 2.14 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
In [6]: %timeit call1()
147 ns ± 6.19 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
In [7]: %timeit call2()
117 ns ± 0.846 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117880
Approved by: https://github.com/Skylion007
In this PR, we are implementing functionalization on the pre-dispatch graph. Today, every dispatch key except for DispatchKey.Python has a dedicated mode stack in Python. PreDispatch tracing relies on this behaviour by pushing ProxyTorchDispatchMode onto the DispatchKey.PreDispatch mode stack and handling the dispatching logic in Python. To make pre-dispatch functionalization work, we now need to push FunctionalTensorMode onto the DispatchKey.PreDispatch mode stack and make sure it runs before ProxyTorchDispatchMode. (This is very similar to how post-dispatch tracing works.) Here are some design decisions we made for this flow to work:
1. FunctionalTensorMode internally calls into the C++ Functionalize key. Since C++ functionalization runs after PreDispatch, if we are not careful we will keep re-entering the PreDispatch key. We solve this by dispatching directly to the C++ Functionalize key.
2. We delete the mode_stack_per_key logic because the only realistic case where it is exercised is PreDispatch, and a plain list is unsafe here in general: the ordering of FunctionalTensorMode and ProxyTorchDispatchMode matters and is hard to enforce on a plain list. Instead, we now have a private class that tracks the PreDispatch mode stack (see the sketch after this list).
3. We will still run CompositeImplicitAutograd decomps in this PR, and disable this logic later as a followup.
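A minimal sketch of what decision 2 amounts to (hypothetical names; the actual private class lives in PyTorch internals and its API may differ):
```
class PreDispatchModeStack:
    # Hypothetical holder for the two pre-dispatch modes with a fixed ordering.
    def __init__(self):
        self._functional_mode = None  # FunctionalTensorMode slot
        self._proxy_mode = None       # ProxyTorchDispatchMode slot

    def set_functional_mode(self, mode):
        self._functional_mode = mode

    def set_proxy_mode(self, mode):
        self._proxy_mode = mode

    def modes_in_run_order(self):
        # FunctionalTensorMode must run before ProxyTorchDispatchMode; a
        # free-form plain list could not guarantee this ordering.
        return [m for m in (self._functional_mode, self._proxy_mode) if m is not None]
```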
Some missing bits after this PR:
1. Preserving autograd ops in a functional form. Right now they still show up in the graph but in a "non-functional" way.
2. Turn off CompositeImplicitAutograd decomps
3. Functionalizing HOO
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113728
Approved by: https://github.com/bdhirsh
Summary:
This is important for writing aten IR based graph transformation.
```
In [4]: [x.name for x in torch.ops.aten.reshape.default._schema.arguments]
Out[4]: ['self', 'shape']
In [8]: torch.ops.aten.reshape.default(torch.rand(1,2), shape=[2])
Out[8]: tensor([0.7584, 0.4834])
# === CANNOT CALL `self` BY KWARGS ===
In [7]: torch.ops.aten.reshape.default(self=torch.rand(1,2), shape=[2])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[7], line 1
----> 1 torch.ops.aten.reshape.default(self=torch.rand(1,2), shape=[2])
TypeError: OpOverload.__call__() got multiple values for argument 'self'
```
# Where's the problem?
1. The first arg of an aten op is usually named `self` (see aten/src/ATen/native/native_functions.yaml).
2. Unfortunately, in `torch._ops.{OpOverload, OpOverloadPacket}.__call__()`, the first arg is (by Python convention) named `self` too.
So when `self` is passed by kwargs, `OpOverloadPacket.__call__` receives:
```
OpOverloadPacket.__call__(self, {"self": ...})
```
Python does not allow an argument to receive a value twice (here, positionally via the bound instance and again by keyword), hence
> TypeError: OpOverload.__call__() got multiple values for argument 'self'
# How to fix?
**Note that**, in above, `self` is an instance of `OpOverloadPacket`, and the "self" kwarg is the input tensor to the aten op. To fix, we only need to differentiate the two `self`s.
In Python, the first arg of a method does not need to be named `self`. So we change the `__call__` definition to:
```
def __call__(_self, ...):
```
Now the call becomes:
```
OpOverloadPacket.__call__(_self, {"self": ...})
```
where:
* `_self` is the `OpOverloadPacket` instance
* `"self"` is the input tensor to the aten op.
Test Plan:
```
In [4]: [x.name for x in torch.ops.aten.reshape.default._schema.arguments]
Out[4]: ['self', 'shape']
In [3]: torch.ops.aten.reshape.default(self=torch.rand(1,2), shape=[2])
Out[3]: tensor([0.5127, 0.3051])
```
Differential Revision: D51731996
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114920
Approved by: https://github.com/houseroad
Users may wish to torch.compile custom ops that mutate their inputs
and return nothing (this is a common class of operators).
torch.compile will automatically support this op without anyone needing
to provide a functionalization kernel for it. Here's how.
Let's say we have a hypothetical mylib::sin_(Tensor(a!) x) -> ()
op. First, when FakeTensor sees this op, it can just return None.
This is the case because custom ops are not allowed to mutate input
metadata, so the FakeTensor rule for one that returns nothing is trivial.
Next, when Python FunctionalTensor sees the op, it will functionalize
it by emitting a call to an auto_functionalize(op, ["x"], {"x": ...})
HOP and replacing the mutated inputs with the outputs of this HOP.
This HOP effectively runs the functional version of the op when
called: it clones inputs that will be mutated, runs the op, and
then returns Tensors with the new values.
In the future we can teach Inductor how to do re-inplacing when it sees
this HOP (like how triton kernels do it) but this isn't urgent (and is
more of a performance problem).
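Here is a rough sketch of the scenario, using the hypothetical mylib::sin_ op from above and registering it from Python only for illustration (the exact auto_functionalize behavior depends on the PyTorch version):
```
import torch

lib = torch.library.Library("mylib", "DEF")
lib.define("sin_(Tensor(a!) x) -> ()")

def sin_impl(x):
    x.copy_(x.sin())  # mutates its input, returns nothing

lib.impl("sin_", sin_impl, "CompositeExplicitAutograd")

@torch.compile
def f(x):
    torch.ops.mylib.sin_(x)  # functionalized via the auto_functionalize HOP
    return x + 1
```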
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114955
Approved by: https://github.com/bdhirsh
This is a reland of #112757. Cannot land the original one internally because the internal diff is not in sync with OSS, due to issues with handling two export repos (executorch and pytorch) using the ghimport-ghexport approach.
Will try the web UI for import and export instead of the ghimport/ghexport flow.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113813
Approved by: https://github.com/angelayi
Summary:
We've made the following changes:
- The new way to use the API is `m.impl_abstract_pystub(module, context)`.
Every subsequent m.def of an op inside the TORCH_LIBRARY block gives
the op the `impl_abstract_pystub`.
- Added a mechanism to determine if an operator was defined in Python or C++.
Library.define in Python appends the op to a global set, which is analogous
to what we do for tracking Library.impl.
- If someone does `torch.library.impl_abstract` in Python for an operator, then
we require that it has an `impl_abstract_pystub` specified, and we also check
that the module in the `impl_abstract_pystub` is the same as the module where
the call to `torch.library.impl_abstract` lives (see the sketch after this list).
- Unfortunately we can't check the "context" (which is the buck target on
buck-based systems) because buck sits above us.
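As a rough Python-side sketch of that check (a hypothetical mylib::foo op, defined here from Python only so the snippet is self-contained; in the scenario above it would be defined in C++ with a matching `impl_abstract_pystub`):
```
import torch

torch.library.define("mylib::foo", "(Tensor x) -> Tensor")

@torch.library.impl_abstract("mylib::foo")
def foo_abstract(x):
    # Abstract/fake impl: propagate shape and dtype only.
    return torch.empty_like(x)
```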
bypass-github-export-checks
Test Plan: - existing tests
Differential Revision: D51080493
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113182
Approved by: https://github.com/ezyang
Summary:
We've made the following changes:
- The new way to use the API is `m.impl_abstract_pystub(module, context)`.
Every subsequent m.def of an op inside the TORCH_LIBRARY block gives
the op the `impl_abstract_pystub`.
- Added a mechanism to determine if an operator was defined in Python or C++.
Library.define in Python appends the op to a global set, which is analogous
to what we do for tracking Library.impl.
- If someone does `torch.library.impl_abstract` in Python for an operator, then
we require that it has an `impl_abstract_pystub` specified and we also check
that the module in the `impl_abstract_pystub` is the same as the module where
the call to `torch.library.impl_abstract` exists.
- Unfortunately we can't check the "context" (which is the buck target on
buck-based systems) because buck sits above us.
Test Plan: - existing tests
Differential Revision: D50972148
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112851
Approved by: https://github.com/ezyang
Summary:
Raising an exception here causes pybind11's dispatcher to kick in, which causes aiplatform's logic to kick in (aiplatform::error_reporting::util::printAddressesWithBestEffortLocationInfo), which ultimately uses `folly::symbolizer::Symbolizer::symbolize` to build up the stack trace. In 3.8 this uses about 3.62% of the CPU time per pyperf (https://fburl.com/scuba/pyperf_experimental/on_demand/oi554uvy). In Cinder 3.8, for some reason, this is worse, using 5.94% of the CPU.
This exception happens when doing a hasattr() on `prims` for things like `bitwise_left_shift` which don't exist: https://www.internalfb.com/code/fbsource/[2d695f650d00]/fbcode/caffe2/torch/_inductor/lowering.py?lines=590
That exception is ultimately going to be swallowed anyway, and the stack trace has no meaningful value. Furthermore, because this is an expected outcome in the code rather than some random C++ exception, the stack trace is less valuable as well.
This change returns (None, None) in the failure case instead of a valid op/overload list, avoiding the exception and reclaiming the 3.62%-5.94% of time.
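A toy sketch of the pattern (illustrative names only, not the actual `torch._ops` internals):
```
_OPS = {"aten::sin": ("<op handle>", ["default"])}

def get_op_and_overloads(qualified_name):
    # Return a (None, None) sentinel instead of raising, so a failed lookup
    # (e.g. hasattr(prims, "bitwise_left_shift")) stays cheap; the caller can
    # raise a plain AttributeError itself if it wants to.
    return _OPS.get(qualified_name, (None, None))

op, overloads = get_op_and_overloads("prims::bitwise_left_shift")
assert op is None and overloads is None
```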
Test Plan: Existing CI and perf run: https://fburl.com/scuba/pyperf_experimental/on_demand/oi554uvy
Differential Revision: D50018789
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111438
Approved by: https://github.com/davidberard98
Generally, to extend PyTorch with custom operators, a user will
create a Python module whose import triggers registration of
the custom operators via a torch.ops.load_library call or a call
to one or more torch.library.* APIs.
It is unexpected for Python modules to have side effects, so some
linters and formatters will complain. Use torch.ops.import_module to
import the module without a linter or formatter complaining.
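For example (a hypothetical module name, just to sketch the intended usage):
```
import torch

# "my_project.custom_ops" is a hypothetical module whose import registers
# custom operators via torch.ops.load_library / torch.library.* calls.
custom_ops = torch.ops.import_module("my_project.custom_ops")
```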
NB: A more robust API would actually check if a custom op was registered
or modified, but this is technically challenging to do. In the future we
can add a warning if a custom op wasn't registered or modified.
Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110090
Approved by: https://github.com/ezyang
We now have two types of functionalization, C++ Functionalization (through the `Functionalize` dispatch key), and python functionalization (through the `FunctionalTensorMode` torch_dispatch mode).
This means that all higher order ops need custom functionalization rules for the python variant too. I added them here, as well as a helper function `dispatch_functionalize()` - equivalent to `torch.func.functionalize()`, except that it uses `FunctionalTensorMode`.
In theory we could have secretly switched `torch.func.functionalize` to use `FunctionalTensorMode`. This would be BC-breaking, though, since `FunctionalTensorMode` isn't composable with the other functorch transforms (the functorch layer-mode stack doesn't know how to re-order torch_dispatch modes arbitrarily).
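For reference, a small illustration of what functionalization does in general, using the public `torch.func.functionalize` (i.e. the C++ path); `dispatch_functionalize()` is the `FunctionalTensorMode`-based analogue:
```
import torch
from torch.func import functionalize
from torch.fx.experimental.proxy_tensor import make_fx

def f(x):
    y = x.clone()
    y.add_(1)        # in-place mutation
    return y

gm = make_fx(functionalize(f))(torch.randn(3))
print(gm.graph)      # contains the functional aten.add, not the in-place aten.add_
```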
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108656
Approved by: https://github.com/zou3519
ghstack dependencies: #109024, #109248
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.
I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I'm enabling it so that it stays like that. :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.
I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I'm enabling it so that it stays like that. :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
Remove _deprecated_global_ns from cond following #104105.
We change the module attribute of HigherOrderOperator instances in the constructor from torch.ops to torch.ops.higher_order when self.namespace is "higher_order". For subclasses (e.g. customized higher order operators), we leave their `__module__` unchanged.
Will import this PR to fix internal tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104380
Approved by: https://github.com/zhxchen17, https://github.com/zou3519
Summary:
Trying to get the `__self__` attribute on any `_OpNamespace` object should be an invalid operation. The `__self__` attribute only exists on instance method objects, not on class objects.
In [dynamo](a152b3e3b8/torch/_dynamo/variables/torch.py (L164)) there is code that tries to access the `__self__` attribute on `TorchVariable`; this currently results in an expensive call to `torch._C._jit_get_operation` [here](a152b3e3b8/torch/_ops.py (L740)) which ultimately fails and throws an exception. When it fails, the operation turns out to be quite expensive, on the order of ~0.03s.
For edge use cases, when exporting large models with quantized ops, this exception is thrown hundreds of times, resulting in a lot of wasted time. By preventing the call to `torch._C._jit_get_operation` we can return from this function quickly and significantly reduce export times. On a large ASR model, for example, export currently takes **~405** seconds; with this change we can reduce it to **~340s**.
Overall this should also be a harmless change, as no one should really ever try to access the `__self__` attribute on an `_OpNamespace` object.
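A rough sketch of the kind of early bail-out this adds (illustrative, not the exact `torch._ops` code):
```
class _FakeOpNamespace:
    def __getattr__(self, op_name):
        if op_name == "__self__":
            # This can never be an operator name, so fail fast instead of paying
            # for an expensive (and ultimately failing) _jit_get_operation call.
            raise AttributeError(f"Invalid attribute '{op_name}' for op namespace")
        ...  # otherwise fall through to the normal (expensive) operator lookup
```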
Test Plan: Added test case.
Differential Revision: D46959879
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104096
Approved by: https://github.com/larryliu0820, https://github.com/ezyang, https://github.com/zou3519
This PR changes the default namespace for higher order operators from the
global namespace (e.g. torch.ops.cond) to `higher_order` (e.g.
torch.ops.higher_order.cond). We don't actually change the namespace
for existing HigherOrderOperators.
The motivation is to stem the bleeding; exposing operators into the global
namespace is a bad idea due to name collision with other user-defined
namespaces.
We will go in and fix the `_deprecated_global_ns` as necessary after this diff.
Differential Revision: [D46809738](https://our.internmc.facebook.com/intern/diff/D46809738/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103870
Approved by: https://github.com/ydwu4
Introduces two higher order operators
* run_and_save_rng_state - Saves the current rng state and then runs the op.
* run_with_rng_state - Runs the op with the rng state supplied as an input
Ideally, we would like to use torch.compile for these operators. But currently the plan is to introduce these operators at the partitioner level, obviating the need to support them fully through the torch.compile stack. To ensure that we have good enough debugging with minifiers, we have ensured that they work with make_fx. In the future, we can move on to torch.compile.
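A pure-eager analogue of the two operators' semantics (this is not the HOP API itself, just an illustration of what they capture):
```
import torch

def run_and_save_rng_state(op, *args):
    state = torch.get_rng_state()
    return state, op(*args)

def run_with_rng_state(state, op, *args):
    current = torch.get_rng_state()
    torch.set_rng_state(state)
    try:
        return op(*args)
    finally:
        torch.set_rng_state(current)

# E.g. replay a random op (think: recomputed dropout) with the forward's rng state.
state, out = run_and_save_rng_state(torch.rand, 4)
replayed = run_with_rng_state(state, torch.rand, 4)
assert torch.equal(out, replayed)
```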
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102934
Approved by: https://github.com/jansel, https://github.com/zou3519
Previously, defining a HigherOrderOperator (like cond) automatically generated
a torch.ops.cond and caused it to trace into the FX graph as e.g.
torch.ops.cond.
This is not good, because:
- Duplication. Since HigherOrderOperators are written in Python, they have an
associated Python function that users should access them from. E.g.
torch.cond (when we make it public). That is what should actually appear in the
graph.
- torch.ops.cond is a valid namespace for operator registration; having
it be a function too confuses things.
This PR:
- Moves cond/map HigherOrderOperators to be under torch (necessary for
the FX logic to not do weird things)
- Sets the `__module__` of a HigherOrderOperator correct. This is what
FX uses when tracing the operator.
Test Plan:
- updated tests
Future:
- I'll delete the ability to call cond as torch.ops.cond in a couple of
days, after this change circulates internally.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103108
Approved by: https://github.com/ydwu4
Also not sure if this should be a public function or not. Leaving it private for now but let me know if you prefer for it to be public.
FYI @nikitaved this will logically conflict with your triton kernel PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101420
Approved by: https://github.com/malfet
This PR introduces a `wrap(body_fn, *args)` higher order operator.
The semantics of `wrap(body_fn, *args)` is to just run `body_fn(*args)`.
Underneath Dynamo, this PR makes it so that we rewrite calls to
`wrap(body_fn, *args)` with `wrap(new_fn, *new_args)` where `new_fn` has
no free variables. This PR does not update cond/map to use the new
mechanism yet (we do not support nn.Modules yet; that will come in the future).
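A small illustration of that free-variable lifting (not actual Dynamo output; names are illustrative):
```
import torch

y = torch.randn(3)

def body_fn(x):
    return x + y            # `y` is a free variable captured from the enclosing scope

# Dynamo rewrites wrap(body_fn, x) into wrap(new_fn, x, y), where new_fn has
# no free variables: every captured value becomes an explicit argument.
def new_fn(x, y):
    return x + y

x = torch.randn(3)
assert torch.equal(body_fn(x), new_fn(x, y))
```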
The design we take is:
- OutputGraph represents the graph being built by Dynamo that may be
compiled and executed.
- OutputGraph owns a root SubgraphTracer, where it builds the FX graph.
- OutputGraph may own multiple nested SubgraphTracers.
- When we need to trace the body function of a HigherOrderOperator, we
construct a new SubgraphTracer to build the graph of the body function.
Mechanically, when Dynamo sees a new `wrap` HigherOrderOperator with a
body function, it:
- Creates a new SubgraphTracer via OutputGraph.new_subtracer
- Executes the body function
This captures the body function into the graph on the new
SubgraphTracer while modifying the state of the OutputGraph. For
example, the OutputGraph may receive new GraphArgs, new guards, and new
side effects.
If capture of the body function fails, then Dynamo graph breaks on the
HigherOrderOperator.
Test Plan:
- added test/dynamo/test_higher_order_ops.py
Future:
- We're not actually able to tell Dynamo to completely graph break on the
HigherOrderOperator. Instead, when we do graph break, Dynamo begins
introspecting `HigherOrderOperator.__call__`. It should probably not do
this.
- Ideally we would error out on new SideEffects. I don't know how to do
this yet.
- We don't support dealing with nn.Modules yet (e.g. calling nn.Modules
or accessing attributes of tracked nn.Modules from a body_fn). There's
an open question on what should actually happen here
- Ideally we would rewrite map/cond to use the new mechanism but we need
to fix the previous bullet point before we can get there.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99988
Approved by: https://github.com/voznesenskym, https://github.com/anijain2305
Twice this week I have had people confuse "operator defined with Python
operator registration aka torch.library" and "PyOperator which is used
to define control flow operators and other operators that cannot be
represented in JIT schema." Renaming PyOperator for clarity.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97493
Approved by: https://github.com/SherlockNoMad