pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
PyTorch MergeBot	07f2efa285	Revert "[HigherOrderOp] Should automatically pop modes (#109157 )" This reverts commit `f03b8abd47`. Reverted https://github.com/pytorch/pytorch/pull/109157 on behalf of https://github.com/clee2000 due to broke internal builds D49346922 ([comment](https://github.com/pytorch/pytorch/pull/109157#issuecomment-1722571262))	2023-09-17 21:19:52 +00:00
Yanbo Liang	f03b8abd47	[HigherOrderOp] Should automatically pop modes (#109157 ) Fixes #108282 Pull Request resolved: https://github.com/pytorch/pytorch/pull/109157 Approved by: https://github.com/zou3519	2023-09-14 20:46:26 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar	2b7271c703	Support cond and out_dtype for predispatch (#107941 ) Summary: Title Test Plan: CI Differential Revision: D48675742 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107941 Approved by: https://github.com/jerryzh168	2023-08-25 17:37:16 +00:00
Aaron Gokaslan	660e8060ad	[BE]: Update ruff to 0.285 (#107519 ) This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings. I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems like there are no instances of it in our codebase, so I am enabling it so that it stays like that. :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519 Approved by: https://github.com/ezyang	2023-08-22 23:16:38 +00:00
PyTorch MergeBot	d59a6864fb	Revert "[BE]: Update ruff to 0.285 (#107519 )" This reverts commit `88ab3e4322`. Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please help them get unblocked? It seems like one of the strings was probably accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))	2023-08-22 19:53:32 +00:00
Aaron Gokaslan	88ab3e4322	[BE]: Update ruff to 0.285 (#107519 ) This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings. I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems like there are no instances of it in our codebase, so I am enabling it so that it stays like that. :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519 Approved by: https://github.com/ezyang	2023-08-20 01:36:18 +00:00
Justin Chu	4cc1745b13	[BE] f-stringify torch/ and scripts (#105538 ) This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`. - https://docs.python.org/3/reference/lexical_analysis.html#f-strings - https://pypi.org/project/flynt/ Command used: ``` flynt torch/ -ll 120 flynt scripts/ -ll 120 flynt tools/ -ll 120 ``` and excluded `collect_env.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538 Approved by: https://github.com/ezyang, https://github.com/malfet	2023-07-21 19:35:24 +00:00
Justin Chu	79c5e33349	[BE] Enable ruff's UP rules and autoformat nn/ mps/ and torch/ (#105436 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105436 Approved by: https://github.com/malfet, https://github.com/albanD	2023-07-21 07:38:46 +00:00
Animesh Jain	735e6ae801	[dynamo] Maintainable code - Move decorators in a separate file (#105070 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105070 Approved by: https://github.com/ezyang	2023-07-13 07:41:19 +00:00
ydwu4	6a3d5f1986	[HigherOrderOp] Remove _deprecated_global_ns from cond (#104380 ) Remove _deprecated_global_ns from cond following #104105. We change the module attribute of HigherOrderOperator instances in the constructor from torch.ops to torch.ops.higher_order when self.namespace is "higher_order". For subclasses (e.g. customized higher order operator), we leave their \_\_module\_\_ unchanged. Will import this PR to fix internal tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104380 Approved by: https://github.com/zhxchen17, https://github.com/zou3519	2023-07-07 17:13:09 +00:00
Tarun Karuturi	6d2da6106d	Raise AttributeError in _OpsNamespace if __self__ attribute is requested (#104096 ) Summary: Trying to get the `__self__` attribute on any `_OpNamespace` object should be an invalid operation. The `__self__` attribute only exists on instance method object and not on class objects. In [dynamo](`a152b3e3b8/torch/_dynamo/variables/torch.py (L164)`) there is code that tries to access the `__self__` attribute on `TorchVariable`, this currently results in an expensive call to `torch._C._jit_get_operation` [here](`a152b3e3b8/torch/_ops.py (L740)`) which ultimately fails and throws an exception. For cases where it fails the operation turns out to be quite expensive on the order of ~0.03s. For edge use cases when exporting large models with quantized ops this exception is thrown 100's of times resulting in a lot of time wasted. By preventing the call to `torch._C._jit_get_operation` we can quickly return from this function and significantly reduce export times. On a large ASR model for example export currently takes ~405 seconds. With this change we can reduce it to ~340s. Overall this should also be a harmless change as no one should mostly ever try to access the `__self__` attribute on any `_OpNamespace` object. Test Plan: Added test case. Differential Revision: D46959879 Pull Request resolved: https://github.com/pytorch/pytorch/pull/104096 Approved by: https://github.com/larryliu0820, https://github.com/ezyang, https://github.com/zou3519	2023-06-27 01:42:06 +00:00
rzou	036cda415f	Change HigherOrderOperator default namespace from global to 'higher_order' (#103870 ) This PR changes the default namespace for higher order operators from the global namespace (e.g. torch.ops.cond) to `higher_order` (e.g. torch.ops.higher_order.cond). We don't actually change the namespace for existing HigherOrderOperators. The motivation is to stem the bleeding; exposing operators into the global namespace is a bad idea due to name collision with other user-defined namespaces. We will go in and fix the `_deprecated_global_ns` as necessary after this diff. Differential Revision: [D46809738](https://our.internmc.facebook.com/intern/diff/D46809738/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/103870 Approved by: https://github.com/ydwu4	2023-06-20 19:10:55 +00:00
Tugsbayasgalan Manlaibaatar	d4b85f3031	Support params/buffers inside cond and map (#102310 ) With #102022, params and buffers are always treated as a special case of free variables. In this PR, I switch the cond and map implementation to this method and deprecate the old tracing mechanism. Differential Revision: [D46746202](https://our.internmc.facebook.com/intern/diff/D46746202) Pull Request resolved: https://github.com/pytorch/pytorch/pull/102310 Approved by: https://github.com/avikchaudhuri, https://github.com/zou3519	2023-06-20 05:33:10 +00:00
PyTorch MergeBot	2087d32811	Revert "Support params/buffers inside cond and map (#102310 )" This reverts commit `766f236bad`. Reverted https://github.com/pytorch/pytorch/pull/102310 on behalf of https://github.com/huydhn due to The test is failing in trunk `766f236bad` ([comment](https://github.com/pytorch/pytorch/pull/102310#issuecomment-1592159710))	2023-06-15 00:29:20 +00:00
Tugsbayasgalan Manlaibaatar	766f236bad	Support params/buffers inside cond and map (#102310 ) With #102022, params and buffers are always treated as a special case of free variables. In this PR, I switch the cond and map implementation to this method and deprecate the old tracing mechanism. Pull Request resolved: https://github.com/pytorch/pytorch/pull/102310 Approved by: https://github.com/avikchaudhuri, https://github.com/zou3519	2023-06-14 22:32:33 +00:00
Animesh Jain	58d2c66a70	[activation checkpointing] Higher order functional rng op wrappers (#102934 ) Introduces two higher order operators * run_and_save_rng_state - Saves the current rng state and then runs the op. * run_with_rng_state - Runs the op with the rng state supplied as an input Ideally, we would like to use torch.compile for these operators. But currently the plan is to introduce these operators at the partitioner level, obviating the need to support them fully through the torch.compile stack. To ensure that we have good enough debugging with minifiers, we have ensure that they work with make_fx. In future, we can move on torch.compile. Pull Request resolved: https://github.com/pytorch/pytorch/pull/102934 Approved by: https://github.com/jansel, https://github.com/zou3519	2023-06-12 22:54:17 +00:00
PyTorch MergeBot	d1f24f73da	Revert "Make HigherOrderOperator stop appearing like torch.ops.* in FX (#103108 )" This reverts commit `194262ee49`. Reverted https://github.com/pytorch/pytorch/pull/103108 on behalf of https://github.com/izaitsevfb due to Breaks executorch internally, see D46581996 ([comment](https://github.com/pytorch/pytorch/pull/103108#issuecomment-1585041505))	2023-06-09 19:31:40 +00:00
Richard Zou	194262ee49	Make HigherOrderOperator stop appearing like torch.ops.* in FX (#103108 ) Previously, defining a HigherOrderOperators (like cond) automatically generates a torch.ops.cond and causes them to trace into the FX graph as e.g. torch.ops.cond. This is not good, because: - Duplication. Since HigherOrderOperators are written in Python, they have an associated Python function that users should access them from. E.g. torch.cond (when we make it public). That is what should actually appear in the graph. - torch.ops.cond is a valid namespace for operator registration; having it be a function too confuses things. This PR: - Moves cond/map HigherOrderOperators to be under torch (necessary for the FX logic to not do weird things) - Sets the `__module__` of a HigherOrderOperator correct. This is what FX uses when tracing the operator. Test Plan: - updated tests Future: - I'll delete the ability to call cond as torch.ops.cond in a couple of days, after this change circulates internally. Pull Request resolved: https://github.com/pytorch/pytorch/pull/103108 Approved by: https://github.com/ydwu4	2023-06-08 01:55:27 +00:00
albanD	59dff01319	Add top level function to check if running with deploy (#101420 ) Also not sure if this should be a public function or not. Leaving it private for now but let me know if you prefer for it to be public. FYI @nikitaved this will logically conflict with your triton kernel PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/101420 Approved by: https://github.com/malfet	2023-05-16 16:05:49 +00:00
PyTorch MergeBot	58f796ff5d	Revert "Initial version of Dynamo capture for HigherOrderOperator (#99988 )" This reverts commit `4c99f9cdf2`. Reverted https://github.com/pytorch/pytorch/pull/99988 on behalf of https://github.com/atalman due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/99988#issuecomment-1533081452))	2023-05-03 14:02:40 +00:00
Richard Zou	4c99f9cdf2	Initial version of Dynamo capture for HigherOrderOperator (#99988 ) This PR introduces a `wrap(body_fn, args)` higher order operator. The semantics of `wrap(body_fn, args)` is to just run `body_fn(args)`. Underneath Dynamo, this PR makes it so that we rewrite calls to `wrap(body_fn, args)` with `wrap(new_fn, *new_args)` where `new_fn` has no free variables. This PR does not update cond/map to use the new mechanism yet (we do not support nn.Modules yet, will come in the future). The design we take is: - OutputGraph represents the graph being built by Dynamo that may be compiled and executed. - OutputGraph owns a root SubgraphTracer, where it builds the FX graph. - OutputGraph may own multiple nested SubgraphTracers. - When we need to trace the body function of a HigherOrderOperator, we construct a new SubgraphTracer to build the graph of the body function. Mechanically, when Dynamo sees a new `wrap` HigherOrderOperator with a body function, it: - Creates a new SubgraphTracer via OutputGraph.new_subtracer - Executes the body function. This captures the body function into the graph on the new SubgraphTracer while modifying the state of the OutputGraph. For example, the OutputGraph may receive new GraphArgs, new guards, and new side effects. If capture of the body function fails, then Dynamo graph breaks on the HigherOrderOperator. Test Plan: - added test/dynamo/test_higher_order_ops.py Future: - We're not actually able to tell Dynamo to completely graph break on the HigherOrderOperator. Instead, when we do graph break, Dynamo begins introspecting `HigherOrderOperator.__call__`. It should probably not do this. - Ideally we would error out on new SideEffects. I don't know how to do this yet. - We don't support dealing with nn.Modules yet (e.g. calling nn.Modules or accessing attributes of tracked nn.Modules from a body_fn). There's an open question on what should actually happen here - Ideally we would rewrite map/cond to use the new mechanism but we need to fix the previous bullet point before we can get there. Pull Request resolved: https://github.com/pytorch/pytorch/pull/99988 Approved by: https://github.com/voznesenskym, https://github.com/anijain2305	2023-05-02 17:11:02 +00:00
Richard Zou	f21a176c03	Python Dispatcher should respect FuncTorchBatchedDecomposition key (#98328 ) Fixes https://github.com/pytorch/pytorch/issues/97425. Python Dispatcher's resolve_key function should be equivalent to computeDispatchTableEntryWithDebug. We added a section to computeDispatchTableEntryWithDebug but forgot to add it to resolve_key. This PR fixes that discrepancy. Test Plan: - new test Pull Request resolved: https://github.com/pytorch/pytorch/pull/98328 Approved by: https://github.com/Chillee, https://github.com/kshitij12345, https://github.com/Neilblaze	2023-04-05 20:32:53 +00:00
Edward Z. Yang	fa4c77e39b	Rename PyOperator to HigherOrderOperator (#97493 ) Twice this week I have had people confuse "operator defined with Python operator registration aka torch.library" and "PyOperator which is used to define control flow operators and other operators that cannot be represented in JIT schema." Renaming PyOperator for clarity. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/97493 Approved by: https://github.com/SherlockNoMad	2023-03-24 05:04:02 +00:00
Brian Hirsh	af440c427b	[draft for discussion] add per-dispatch key modes (#97052 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/97052 Approved by: https://github.com/ezyang, https://github.com/zou3519	2023-03-21 23:45:45 +00:00
BowenBao	60a68477a6	Bump black version to 23.1.0 (#96578 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/96578 Approved by: https://github.com/ezyang	2023-03-15 06:27:59 +00:00
Edward Z. Yang	6a675f7cac	Correctly resolve dispatch keys for PyOperator (#96306 ) Previously, we never actually used resolve_key, which meant that you had to register CPU/CUDA/etc all manually; none of the alias keys worked. Now they work. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96306 Approved by: https://github.com/Skylion007, https://github.com/zou3519	2023-03-09 22:16:31 +00:00
Edward Z. Yang	32ffd70644	Rewrite fallthrough to more closely match how C++ works (#96304 ) Fallthrough is modeled as a mask which we use to remove keys from the compute dispatch key set for eligibility. It's possible this addresses https://github.com/pytorch/pytorch/issues/89037 in a better way than https://github.com/pytorch/pytorch/pull/95891 but I cannot easily tell as the original repro no longer works and the new PR does not have a test. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96304 Approved by: https://github.com/zou3519, https://github.com/albanD, https://github.com/zhxchen17	2023-03-08 23:00:26 +00:00
Edward Z. Yang	67c329bc9b	Refactor to reduce duplicate logic in torch._ops (#96302 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96302 Approved by: https://github.com/zou3519	2023-03-08 23:00:26 +00:00
Nikita Shulga	941ff109d3	`dl_open_guard` should restore flag even after exception (#96231 ) I.e. follow pattern outlined in https://docs.python.org/3.8/library/contextlib.html#contextlib.contextmanager Also, return early on non-unix platforms (when `sys.getdlopenflags` is not defined) Fixes https://github.com/pytorch/pytorch/issues/96159 Pull Request resolved: https://github.com/pytorch/pytorch/pull/96231 Approved by: https://github.com/atalman	2023-03-08 06:01:27 +00:00
Xuehai Pan	5b1cedacde	[BE] [2/3] Rewrite `super()` calls in functorch and torch (#94588 ) Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied. - #94587 - #94588 - #94592 Also, methods with only a `super()` call are removed: ```diff class MyModule(nn.Module): - def __init__(self): - super().__init__() - def forward(self, ...): ... ``` Some cases that change the semantics should be kept unchanged. E.g.: `f152a79be9/caffe2/python/net_printer.py (L184-L190)` `f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/94588 Approved by: https://github.com/ezyang, https://github.com/albanD	2023-02-10 21:16:33 +00:00
Angela Yi	5fdddbbfe8	Fix checking of current mode in PyOperator dispatch (#92357 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/92357 Approved by: https://github.com/voznesenskym	2023-01-18 23:08:36 +00:00
Richard Zou	da42eab48b	Fix circular import in torch/autograd/function.py (#90415 ) It turns out it is possible to break cycles by not directly importing a module: - there's a problem that torch.jit imports torch._ops and torch._ops import torch.jit - there's another problem that torch.autograd.function imports custom_function_call but torch._functorch.autograd_function imports torch.autograd.function The "better" way to handle all of this is to do some large refactoring so that torch._functorch.autograd_function imports some file that has _SingleLevelAutogradFunction and then have torch.autograd.function depend on torch.functorch.autograd_function... (and ditto for torch.jit vs torch._ops), but I'm scared to move code around too much for BC reasons and the fix in this PR works well. Test Plan: - import torch Pull Request resolved: https://github.com/pytorch/pytorch/pull/90415 Approved by: https://github.com/albanD, https://github.com/soulitzer	2022-12-14 16:20:57 +00:00
Edward Z. Yang	5266953443	Add crossref debug mode for functionalization, catches stride errors (#89498 ) The idea is to add a custom handler to Functionalize key in Python dispatcher that runs the functionalized version along side a non functionalized version, and checks that their outputs agree in the end. (Technically, for metadata mutation we should also check the inputs, but for now we're relying on those functions returning self.) I turned this on for test_functionalize.py (new TestCrossRefFunctionalize) and found a bunch of failures that look legit. This probably doesn't interact that nicely if you're also tracing at the same time, probably need more special logic for that (directly, just disabling tracing for when we create the nested fake tensor mode, but IDK if there's a more principled way to organize this.) There are some misc fixups which I can split if people really want. - xfail_inherited_tests moved to test common_utils - Bindings for _dispatch_tls_set_dispatch_key_included, _dispatch_tls_is_dispatch_key_included and _functionalization_reapply_views_tls - Type stubs for _enable_functionalization, _disable_functionalization - all_known_overloads utility to let you iterate over all OpOverloads in all namespaces. Iterator support on all torch._ops objects to let you iterate over their members. - suspend_functionalization lets you temporarily disable functionalization mode in a context - check_metadata_matches for easily comparing outputs of functions and see if they match (TODO: there are a few copies of this logic, consolidate!) - _fmt for easily printing the metadata of a tensor without its data - _uncache_dispatch for removing a particular dispatch key from the cache, so that we force it to regenerate - check_significant_strides new kwarg only_cuda to let you also do stride test even when inputs are not CUDA - Functionalize in torch._C.DispatchKey Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/89498 Approved by: https://github.com/malfet	2022-11-23 04:18:25 +00:00
Edward Z. Yang	5582001bd5	Reland 2 "Towards unifying symbolic and non symbolic fake tensor (#89038 ) (#89143 )" (#89346 ) This reverts commit `8e4c9828f4`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89346 Approved by: https://github.com/wconstab	2022-11-19 21:14:31 +00:00
PyTorch MergeBot	8e4c9828f4	Revert "Reland "Towards unifying symbolic and non symbolic fake tensor (#89038 )" (#89143 )" This reverts commit `e686b8c3ba`. Reverted https://github.com/pytorch/pytorch/pull/89143 on behalf of https://github.com/ZainRizvi due to This seems to be causing the test_make_fx_symbolic_exhaustive_rad2deg_cpu_float32 and test_make_fx_symbolic_exhaustive_inplace_rad2deg_cpu_float32 test to fail across multiple jobs	2022-11-17 17:02:36 +00:00
Edward Z. Yang	e686b8c3ba	Reland "Towards unifying symbolic and non symbolic fake tensor (#89038 )" (#89143 ) This reverts commit `cf6003f046`. Differential Revision: [D41363992](https://our.internmc.facebook.com/intern/diff/D41363992) Pull Request resolved: https://github.com/pytorch/pytorch/pull/89143 Approved by: https://github.com/albanD	2022-11-17 13:55:06 +00:00
PyTorch MergeBot	cf6003f046	Revert "Towards unifying symbolic and non symbolic fake tensor (#89038 )" This reverts commit `37d54239c7`. Reverted https://github.com/pytorch/pytorch/pull/89038 on behalf of https://github.com/ezyang due to executorch segfaults	2022-11-16 16:52:47 +00:00
Edward Z. Yang	37d54239c7	Towards unifying symbolic and non symbolic fake tensor (#89038 ) Fake tensor behaves pretty differently depending on if you have symbolic shapes or not. This leads to bugs; for example, we weren't getting correct convolution_backward strides because we bypassed the correct stride logic in fake tensor on symbolic shapes. This PR attempts to unify the two codepaths. I don't manage to unify everything, but I get most of it. The algorithm is delicate and I'm still hosing down test failures. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/89038 Approved by: https://github.com/anjali411	2022-11-16 14:02:43 +00:00
Richard Zou	3bc327993f	PyDispatcher integration with functorch (#88785 ) This PR teaches PyDispatcher and PyOperator about functorch transforms. It is important that PyDispatcher/PyOperator dispatch with functorch transforms, because this is our plan for higher-order operators (operators that accept functions as arguments). Examples of these include: - functorch transforms over the existing cond operator (control flow) - autograd.Function support for functorch (which I am working towards), - AOTDispatcher (should be a higher order operator) Concretely, the problem with teaching PyDispatcher/PyOperator about functorch is that the stack-based dispatching logic (DynamicLayerStack) is hidden inside the fallbacks for two dispatch keys (DynamicLayer{Front, Back}). PyDispatcher doesn't know about C++ boxed fallbacks, our plan on record for that is that we need to reimplement all of them in Python (but can call helper functions in C++ to make our lives easier). Instead of exposing all of what DynamicLayer{Front, Back} do to python, this PR takes the approach of re-implementing part of the stack-based dispatching in Python. The motivation is that this is more sane and follows what the "ideal" implementation of functorch would have been: - each transform should be a "mode" - there should be no TLS dispatch key set hackery. functorch needs to do this hackery today to re-use VariableType implementations. This PR: - exposes the DynamicLayerStack to Python - The DynamicLayerStack is a stack of Interpreters. These get exposed to Python as well. - Interpreters can run operations (Interpreter.process) or lower them to the next interpreter in the stack (Interpreter.lower) - To use a PyOperator with functorch transforms, a developer needs to register a rule for each transform (vmap, grad, jvp, ...). - The PyOperator API is NOT user-facing. Things like autograd.Function support for functorch will end up going through the autograd.Function API. Question for reviewers: - Does this design make sense? - I'm trying to split up the "functorch support for autograd.Function" work into logical pieces. Would it be better if I didn't? (the full thing is a bit long - 1000-2000 LOC). Test Plan: - new tests that construct PyOperator and compose them with functorch transforms Pull Request resolved: https://github.com/pytorch/pytorch/pull/88785 Approved by: https://github.com/samdow, https://github.com/soulitzer	2022-11-16 00:46:59 +00:00
Sherlock Huang	133e61af7a	OpOverload is_view (#88722 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88722 Approved by: https://github.com/ezyang	2022-11-09 19:03:12 +00:00
Edward Z. Yang	53eac1d482	Revert "Revert "Put Python Dispatcher cache in dict, clear it on new registrations. (#88329 )"" (#88489 ) The bug was that I was accidentally caching at the wrong key name, so we were never actually hitting the cache. I've renamed the resolved key to final_key to avoid shadowing in this way. This reverts commit `410ce96a23`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88489 Approved by: https://github.com/albanD	2022-11-04 19:23:04 +00:00
PyTorch MergeBot	410ce96a23	Revert "Put Python Dispatcher cache in dict, clear it on new registrations. (#88329 )" This reverts commit `86c7cd287c`. Reverted https://github.com/pytorch/pytorch/pull/88329 on behalf of https://github.com/clee2000 due to test_decomp takes an extra 2 hours in some jobs, windows takes so long it times out	2022-11-03 21:57:19 +00:00
Edward Z. Yang	86c7cd287c	Put Python Dispatcher cache in dict, clear it on new registrations. (#88329 ) The motivation is that I am going to add the ability to temporarily install entries to the python dispatcher, and to do that, I need an easier way to clear the cache. Putting the cache in a dict centralizes cache clearing in one place. I then add some easy cache clearing. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/88329 Approved by: https://github.com/albanD	2022-11-03 12:53:51 +00:00
Edward Z. Yang	97d3b200ca	Unconditionally enable python dispatcher in AOTAutograd (#88365 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/88365 Approved by: https://github.com/Chillee	2022-11-03 12:52:19 +00:00
Sherlock Huang	eb99c1efce	Prefer python meta function over c++ meta function (#87426 ) This is a policy update for meta registration. We now prefer python meta implementation over C++ meta function. This is a flip of the previous policy, where we prefer C++ meta function over python meta function if they both exist. Here's the meta registration process: 1. register_meta and register_decomposition will place the python meta/decomp functions into the `global_decomp_table`. However, they will NOT register them into dispatcher. 2. After global_decomp_table is populated, we will compile an `active_meta_table`. For a given op, we pick the most specific decomp function from `global_decomp_table` in the preference order of Meta > PostAutograd > PreAutograd. 3. We will unconditionally register all of them into python dispatcher. And register them into C++ dispatcher, unless it one of the following 3 cases - 1. the op is a CompositeImplicitAutograd, and should rely on decomposed op's meta - 2. the op is a view op, as the MetaTensor doesn't support aliased storage - 3. the op is in the blocklist (due to UT failures, and we will burn down this list op by op) Over the long run, we wish to implement all meta functions in python. With this PR, 321 op_overloads will have cpp meta overridden by python meta. There are still 400 op_overloads is using cpp meta. The exact list can be found here https://gist.github.com/SherlockNoMad/d20bb736178df8eebd3b054c8bb7cdc5 cc @ngimel @jansel @lezcano @fdrocha @mlazos @soumith @voznesenskym @yanboliang Pull Request resolved: https://github.com/pytorch/pytorch/pull/87426 Approved by: https://github.com/ezyang, https://github.com/jansel	2022-10-25 16:49:02 +00:00
samdow	18d8c548f4	[Modes] remove enable and rewrite mode stack (squashed) (#84774 ) Based on @ezyang's suggestion, mode stack now has "one true mode" which is the _only_ mode that can ever be active at the C++ level. That mode's torch dispatch is just to take the top mode in the stack, reenable itself (if we aren't at the end of the mode stack), and run the top mode's torch_{dispatch\|function} This maintains that in the middle of a mode's torch dispatch, the mode itself will not be active. It changes the function the user has to call to see what the current mode is (no longer queries the C++, it's python only) but allows the user to also see the entire mode stack easily Removes `enable_torch_dispatch_mode` and `.restore()` since neither makes sense in this new setup ### Background Why do we want this? Well, a pretty common pattern that was coming up was that users had to do something like ```python ## PRE-PR UX def f(mode): with mode.restore(): # user needs to understand this restore thing? ... with Mode() as m: pass f(m) ``` Many users were getting error from forgetting to call `.restore` or from forgetting to add the (tbh weird) "mode instantiation" step where they use the mode as a context manager with an empty body. Really, they wanted to treat modes like context managers and just write ```python ## FROM FEEDBACK, USER DESIRED CODE. POSSIBLE POST-PR def f(mode): with mode: ... f(Mode()) ``` Technical Details With the old mode stack, we basically had a linked list so the mode itself could only be used once and had a fixed parent. In this new design, the mode stack is just a python list that we're pushing to and popping from. There's only one mode that's ever active at the C++ level and it runs the next mode in the Python list. The modes don't have state on them anymore Pull Request resolved: https://github.com/pytorch/pytorch/pull/84774 Approved by: https://github.com/ezyang, https://github.com/zou3519	2022-09-27 01:04:35 +00:00
Nikolay Korovaiko	b4f9b68225	should_check_strides (#85416 ) This PR ports `should_check_strides` checks from `origin/symbolic-shapes` to `master` as the part of our dynamic shapes landing effort. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85416 Approved by: https://github.com/ezyang	2022-09-23 04:55:50 +00:00
Edward Z. Yang	490727a35f	New calling convention for Python dispatcher (#85133 ) Instead of calling into the Python dispatcher for EVERY dispatcher call, we now have a two step process. First, we getattr(op: OpOverload, dispatch_key) to "load" the handler for the function. This can either be a conventional function (in which case we will call it, in the same way the old Python dispatcher worked), or it can be a DispatchKey, in which case we will directly call that DispatchKey in C++, bypassing marshalling between Python and C++ entirely. OpOverload.__getattr__ is carefully written so that it will cache the A further optimization would be to define __slots__ on OpOverload, and ensuring that the DispatchKey strings are interned. The resulting Python dispatcher is less flexible: after the first lookup, the handler is cached and we won't recompute it. Furthermore, by default, dispatches will not go into Python, and so you won't get stack frames for the Python dispatcher by default. But we get a huge performance improvement: on the following microbenchmark we go from 2.5s to 1.9s. ``` import time import torch from functorch import make_fx def f(x): for i in range(1000): x = x * x return x begin = time.time() res = make_fx(f, tracing_mode="symbolic")(torch.randn(10, 20)) print(time.time()-begin) ``` Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/85133 Approved by: https://github.com/wconstab	2022-09-16 20:38:21 +00:00
Edward Z. Yang	1275e2df1f	Remove getattr magic method from OpOverload (#85090 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/85090 Approved by: https://github.com/wconstab	2022-09-16 00:28:50 +00:00
Michael Voznesensky	8ca1839d32	Python Dispatcher integration with C++ dispatcher (#85050 ) #84826 but without ghstack Pull Request resolved: https://github.com/pytorch/pytorch/pull/85050 Approved by: https://github.com/malfet	2022-09-15 00:43:36 +00:00

92 Commits