Commit Graph

77 Commits

Author SHA1 Message Date
Animesh Jain
58d2c66a70 [activation checkpointing] Higher order functional rng op wrappers (#102934)
Introduces two higher order operators
* run_and_save_rng_state - Saves the current rng state and then runs the op.
* run_with_rng_state - Runs the op with the rng state supplied as an input

Ideally, we would like to use torch.compile for these operators. But currently the plan is to introduce these operators at the partitioner level, obviating the need to support them fully through the torch.compile stack. To ensure that we have good enough debugging with minifiers, we have ensured that they work with make_fx. In the future, we can move them onto torch.compile.
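
A minimal pure-Python sketch of the intended semantics of the two wrappers (illustrative only; the helper names are made up and this is not the actual operator implementation):

```python
import torch

def run_and_save_rng_state_sketch(op, *args):
    # Save the current (CPU) rng state, then run the op.
    rng_state = torch.get_rng_state()
    return rng_state, op(*args)

def run_with_rng_state_sketch(rng_state, op, *args):
    # Run the op under the supplied rng state, restoring the old one afterwards.
    with torch.random.fork_rng():
        torch.set_rng_state(rng_state)
        return op(*args)

# Replay dropout in the recomputation pass with the same randomness:
state, out = run_and_save_rng_state_sketch(torch.dropout, torch.ones(8), 0.5, True)
replayed = run_with_rng_state_sketch(state, torch.dropout, torch.ones(8), 0.5, True)
assert torch.equal(out, replayed)
```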

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102934
Approved by: https://github.com/jansel, https://github.com/zou3519
2023-06-12 22:54:17 +00:00
PyTorch MergeBot
d1f24f73da Revert "Make HigherOrderOperator stop appearing like torch.ops.* in FX (#103108)"
This reverts commit 194262ee49.

Reverted https://github.com/pytorch/pytorch/pull/103108 on behalf of https://github.com/izaitsevfb due to Breaks executorch internally, see D46581996 ([comment](https://github.com/pytorch/pytorch/pull/103108#issuecomment-1585041505))
2023-06-09 19:31:40 +00:00
Richard Zou
194262ee49 Make HigherOrderOperator stop appearing like torch.ops.* in FX (#103108)
Previously, defining a HigherOrderOperator (like cond) automatically generated
a torch.ops.cond and caused it to trace into the FX graph as e.g.
torch.ops.cond.

This is not good, because:
- Duplication. Since HigherOrderOperators are written in Python, they have an
associated Python function that users should access them from. E.g.
torch.cond (when we make it public). That is what should actually appear in the
graph.
- torch.ops.cond is a valid namespace for operator registration; having
it be a function too confuses things.

This PR:
- Moves cond/map HigherOrderOperators to be under torch (necessary for
the FX logic to not do weird things)
- Sets the `__module__` of a HigherOrderOperator correctly. This is what
FX uses when tracing the operator.
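
A small sketch of why `__module__` matters: the qualified name FX prints for a call_function target is derived from the callable's `__module__` and `__qualname__` (the helper below is an approximation, not the real FX internals):

```python
def qualified_name(fn):
    # Approximation of how FX renders a call_function target's name.
    return f"{fn.__module__}.{fn.__qualname__}"

class FakeHigherOrderOperator:          # illustrative stand-in, not torch code
    def __init__(self, name):
        self.__name__ = name
        self.__qualname__ = name
        # Setting __module__ to "torch" is what makes the op render as
        # torch.cond instead of torch.ops.cond in traced graphs.
        self.__module__ = "torch"

    def __call__(self, *args, **kwargs):
        raise NotImplementedError

cond = FakeHigherOrderOperator("cond")
print(qualified_name(cond))  # torch.cond
```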

Test Plan:
- updated tests

Future:
- I'll delete the ability to call cond as torch.ops.cond in a couple of
days, after this change circulates internally.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103108
Approved by: https://github.com/ydwu4
2023-06-08 01:55:27 +00:00
albanD
59dff01319 Add top level function to check if running with deploy (#101420)
Also, I'm not sure whether this should be a public function or not. Leaving it private for now, but let me know if you would prefer it to be public.

FYI @nikitaved this will logically conflict with your triton kernel PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101420
Approved by: https://github.com/malfet
2023-05-16 16:05:49 +00:00
PyTorch MergeBot
58f796ff5d Revert "Initial version of Dynamo capture for HigherOrderOperator (#99988)"
This reverts commit 4c99f9cdf2.

Reverted https://github.com/pytorch/pytorch/pull/99988 on behalf of https://github.com/atalman due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/99988#issuecomment-1533081452))
2023-05-03 14:02:40 +00:00
Richard Zou
4c99f9cdf2 Initial version of Dynamo capture for HigherOrderOperator (#99988)
This PR introduces a `wrap(body_fn, *args)` higher order operator.
The semantics of `wrap(body_fn, *args)` are to just run `body_fn(*args)`.

Underneath Dynamo, this PR makes it so that we rewrite calls to
`wrap(body_fn, *args)` with `wrap(new_fn, *new_args)` where `new_fn` has
no free variables. This PR does not update cond/map to use the new
mechanism yet (we do not support nn.Modules yet; that will come in the future).
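
A plain-Python sketch of that closure-lifting rewrite (illustrative only; this is not the actual Dynamo implementation):

```python
def wrap(body_fn, *args):
    # Semantics of the higher order op: just run the body.
    return body_fn(*args)

y = 3.0

def body_with_free_var(x):
    return x * y            # `y` is a free variable of the body

# Dynamo rewrites the call so the new body has no free variables and the
# lifted values are passed explicitly as extra arguments:
def lifted_body(x, y):
    return x * y

assert wrap(body_with_free_var, 2.0) == wrap(lifted_body, 2.0, y)
```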

The design we take is:
- OutputGraph represents the graph being built by Dynamo that may be
compiled and executed.
- OutputGraph owns a root SubgraphTracer, where it builds the FX graph.
- OutputGraph may own multiple nested SubgraphTracers.
- When we need to trace the body function of a HigherOrderOperator, we
construct a new SubgraphTracer to build the graph of the body function.

Mechanically, when Dynamo sees a new `wrap` HigherOrderOperator with a
body function, it:
- Creates a new SubgraphTracer via OutputGraph.new_subtracer
- Executes the body function

This captures the body function into the graph on the new
SubgraphTracer while modifying the state of the OutputGraph. For
example, the OutputGraph may receive new GraphArgs, new guards, and new
side effects.

If capture of the body function fails, then Dynamo graph breaks on the
HigherOrderOperator.

Test Plan:
- added test/dynamo/test_higher_order_ops.py

Future:
- We're not actually able to tell Dynamo to completely graph break on the
HigherOrderOperator. Instead, when we do graph break, Dynamo begins
introspecting `HigherOrderOperator.__call__`. It should probably not do
this.
- Ideally we would error out on new SideEffects. I don't know how to do
this yet.
- We don't support dealing with nn.Modules yet (e.g. calling nn.Modules
or accessing attributes of tracked nn.Modules from a body_fn). There's
an open question on what should actually happen here.
- Ideally we would rewrite map/cond to use the new mechanism but we need
to fix the previous bullet point before we can get there.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99988
Approved by: https://github.com/voznesenskym, https://github.com/anijain2305
2023-05-02 17:11:02 +00:00
Richard Zou
f21a176c03 Python Dispatcher should respect FuncTorchBatchedDecomposition key (#98328)
Fixes https://github.com/pytorch/pytorch/issues/97425.

Python Dispatcher's resolve_key function should be equivalent to
computeDispatchTableEntryWithDebug. We added a section to
computeDispatchTableEntryWithDebug but forgot to add it to resolve_key.

This PR fixes that discrepancy.

Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98328
Approved by: https://github.com/Chillee, https://github.com/kshitij12345, https://github.com/Neilblaze
2023-04-05 20:32:53 +00:00
Edward Z. Yang
fa4c77e39b Rename PyOperator to HigherOrderOperator (#97493)
Twice this week I have had people confuse "operator defined with Python
operator registration aka torch.library" and "PyOperator which is used
to define control flow operators and other operators that cannot be
represented in JIT schema."  Renaming PyOperator for clarity.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97493
Approved by: https://github.com/SherlockNoMad
2023-03-24 05:04:02 +00:00
Brian Hirsh
af440c427b [draft for discussion] add per-dispatch key modes (#97052)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97052
Approved by: https://github.com/ezyang, https://github.com/zou3519
2023-03-21 23:45:45 +00:00
BowenBao
60a68477a6 Bump black version to 23.1.0 (#96578)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96578
Approved by: https://github.com/ezyang
2023-03-15 06:27:59 +00:00
Edward Z. Yang
6a675f7cac Correctly resolve dispatch keys for PyOperator (#96306)
Previously, we never actually used resolve_key, which meant that
you had to register CPU/CUDA/etc all manually; none of the alias
keys worked.  Now they work.
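
A simplified, pure-Python sketch of what alias-key resolution buys you (illustrative; not the real resolve_key logic, and the key names are trimmed down):

```python
ALIAS_FALLBACKS = {
    "CPU": ["CPU", "CompositeExplicitAutograd", "CompositeImplicitAutograd"],
    "CUDA": ["CUDA", "CompositeExplicitAutograd", "CompositeImplicitAutograd"],
}

def resolve_kernel(registered, runtime_key):
    # Pick the most specific registered kernel, falling back to alias keys.
    for candidate in ALIAS_FALLBACKS.get(runtime_key, [runtime_key]):
        if candidate in registered:
            return registered[candidate]
    raise RuntimeError(f"no kernel for {runtime_key}")

kernels = {"CompositeExplicitAutograd": lambda x: x + 1}
print(resolve_kernel(kernels, "CPU")(41))   # 42, found via the alias key
print(resolve_kernel(kernels, "CUDA")(41))  # 42, same single registration
```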

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96306
Approved by: https://github.com/Skylion007, https://github.com/zou3519
2023-03-09 22:16:31 +00:00
Edward Z. Yang
32ffd70644 Rewrite fallthrough to more closely match how C++ works (#96304)
Fallthrough is modeled as a mask which we use to remove keys from the
computed dispatch key set when determining eligibility.
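
A small sketch of the masking idea in plain Python (illustrative; the key names and priorities are made up, and this is not the C++ logic):

```python
PRIORITY = ["FuncTorchBatched", "AutogradCPU", "CPU"]   # high to low, illustrative

def highest_priority_key(keyset, fallthrough_mask):
    # Keys registered as fallthrough are masked out before we pick
    # the highest-priority remaining key.
    eligible = keyset - fallthrough_mask
    for key in PRIORITY:
        if key in eligible:
            return key
    raise RuntimeError("no eligible dispatch key")

keys = {"AutogradCPU", "CPU"}
print(highest_priority_key(keys, set()))             # AutogradCPU
print(highest_priority_key(keys, {"AutogradCPU"}))   # CPU (autograd falls through)
```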

It's possible this addresses https://github.com/pytorch/pytorch/issues/89037
in a better way than https://github.com/pytorch/pytorch/pull/95891 but I
cannot easily tell as the original repro no longer works and the new PR
does not have a test.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96304
Approved by: https://github.com/zou3519, https://github.com/albanD, https://github.com/zhxchen17
2023-03-08 23:00:26 +00:00
Edward Z. Yang
67c329bc9b Refactor to reduce duplicate logic in torch._ops (#96302)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96302
Approved by: https://github.com/zou3519
2023-03-08 23:00:26 +00:00
Nikita Shulga
941ff109d3 dl_open_guard should restore flag even after exception (#96231)
I.e., follow the pattern outlined in https://docs.python.org/3.8/library/contextlib.html#contextlib.contextmanager

Also, return early on non-unix platforms (when `sys.getdlopenflags` is not defined)

Fixes https://github.com/pytorch/pytorch/issues/96159
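
A sketch of the pattern (this mirrors the described behavior but is not the actual torch._ops.dl_open_guard implementation):

```python
import contextlib
import ctypes
import sys

@contextlib.contextmanager
def dl_open_guard_sketch():
    if not hasattr(sys, "getdlopenflags"):
        # Non-unix platforms: nothing to save or restore.
        yield
        return
    old_flags = sys.getdlopenflags()
    sys.setdlopenflags(old_flags | ctypes.RTLD_GLOBAL)
    try:
        yield
    finally:
        # Runs even if the body raises, so the flags are always restored.
        sys.setdlopenflags(old_flags)
```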

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96231
Approved by: https://github.com/atalman
2023-03-08 06:01:27 +00:00
Xuehai Pan
5b1cedacde [BE] [2/3] Rewrite super() calls in functorch and torch (#94588)
Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied.

- #94587
- #94588
- #94592

Also, methods with only a `super()` call are removed:

```diff
class MyModule(nn.Module):
-   def __init__(self):
-       super().__init__()
-
    def forward(self, ...):
        ...
```

Some cases that change the semantics should be kept unchanged. E.g.:

f152a79be9/caffe2/python/net_printer.py (L184-L190)

f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94588
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-10 21:16:33 +00:00
Angela Yi
5fdddbbfe8 Fix checking of current mode in PyOperator dispatch (#92357)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92357
Approved by: https://github.com/voznesenskym
2023-01-18 23:08:36 +00:00
Richard Zou
da42eab48b Fix circular import in torch/autograd/function.py (#90415)
It turns out it is possible to break cycles by not directly importing a
module:
- there's a problem that torch.jit imports torch._ops and torch._ops
imports torch.jit
- there's another problem that torch.autograd.function imports
custom_function_call but torch._functorch.autograd_function imports
torch.autograd.function

The "better" way to handle all of this is to do some large refactoring so
that torch._functorch.autograd_function imports some file that has
_SingleLevelAutogradFunction and then have torch.autograd.function
depend on torch.functorch.autograd_function... (and ditto for torch.jit
vs torch._ops), but I'm scared to move code around too much for BC
reasons and the fix in this PR works well.
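
A generic, self-contained illustration of the cycle-breaking trick, i.e. deferring the import into the function that needs it (a minimal sketch of the idea, not the exact edit made in this PR):

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Two modules that need each other. mod_a defers its import of mod_b into
# the function body, so importing either module at top level succeeds.
tmp = Path(tempfile.mkdtemp())
(tmp / "mod_a.py").write_text(textwrap.dedent("""
    def call_helper():
        import mod_b          # deferred: runs after both modules loaded
        return mod_b.helper()
"""))
(tmp / "mod_b.py").write_text(textwrap.dedent("""
    import mod_a              # top-level import of the *other* module

    def helper():
        return "ok"
"""))
sys.path.insert(0, str(tmp))
mod_b = importlib.import_module("mod_b")
print(mod_b.mod_a.call_helper())   # ok, no circular-import error
```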

Test Plan:
- import torch
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90415
Approved by: https://github.com/albanD, https://github.com/soulitzer
2022-12-14 16:20:57 +00:00
Edward Z. Yang
5266953443 Add crossref debug mode for functionalization, catches stride errors (#89498)
The idea is to add a custom handler to the Functionalize key in the Python
dispatcher that runs the functionalized version alongside a
non-functionalized version, and checks that their outputs agree in the
end.  (Technically, for metadata mutation we should also check the
inputs, but for now we're relying on those functions returning self.)
I turned this on for test_functionalize.py (new TestCrossRefFunctionalize)
and found a bunch of failures that look legit.
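
A hedged sketch of the crossref idea at the user level, comparing a plain run against a functionalized run (this uses torch.func.functionalize for brevity and is not the actual crossref mode):

```python
import torch

def crossref_check(fn, *args):
    # Run the function with and without functionalization on independent
    # clones of the inputs, then compare strides and values.
    plain_out = fn(*(a.clone() for a in args))
    func_out = torch.func.functionalize(fn)(*(a.clone() for a in args))
    assert plain_out.stride() == func_out.stride()
    torch.testing.assert_close(plain_out, func_out)
    return plain_out

def f(x):
    y = x.view(2, 5)
    y.add_(1)          # in-place op that functionalization rewrites
    return y

crossref_check(f, torch.zeros(10))
```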

This probably doesn't interact that nicely if you're also tracing at
the same time; we probably need more special logic for that (directly,
just disabling tracing when we create the nested fake tensor mode,
but IDK if there's a more principled way to organize this).

There are some misc fixups which I can split if people really want.

- xfail_inherited_tests moved to test common_utils
- Bindings for _dispatch_tls_set_dispatch_key_included,
  _dispatch_tls_is_dispatch_key_included and _functionalization_reapply_views_tls
- Type stubs for _enable_functionalization, _disable_functionalization
- all_known_overloads utility to let you iterate over all OpOverloads
  in all namespaces.  Iterator support on all torch._ops objects to let
  you iterate over their members.
- suspend_functionalization lets you temporarily disable functionalization mode
  in a context
- check_metadata_matches for easily comparing outputs of functions and see
  if they match (TODO: there are a few copies of this logic, consolidate!)
- _fmt for easily printing the metadata of a tensor without its data
- _uncache_dispatch for removing a particular dispatch key from the cache,
  so that we force it to regenerate
- check_significant_strides new kwarg only_cuda to let you also do stride
  test even when inputs are not CUDA
- Functionalize in torch._C.DispatchKey

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89498
Approved by: https://github.com/malfet
2022-11-23 04:18:25 +00:00
Edward Z. Yang
5582001bd5 Reland 2 "Towards unifying symbolic and non symbolic fake tensor (#89038) (#89143)" (#89346)
This reverts commit 8e4c9828f4.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89346
Approved by: https://github.com/wconstab
2022-11-19 21:14:31 +00:00
PyTorch MergeBot
8e4c9828f4 Revert "Reland "Towards unifying symbolic and non symbolic fake tensor (#89038)" (#89143)"
This reverts commit e686b8c3ba.

Reverted https://github.com/pytorch/pytorch/pull/89143 on behalf of https://github.com/ZainRizvi due to This seems to be causing the test_make_fx_symbolic_exhaustive_rad2deg_cpu_float32 and test_make_fx_symbolic_exhaustive_inplace_rad2deg_cpu_float32 test to fail across multiple jobs
2022-11-17 17:02:36 +00:00
Edward Z. Yang
e686b8c3ba Reland "Towards unifying symbolic and non symbolic fake tensor (#89038)" (#89143)
This reverts commit cf6003f046.

Differential Revision: [D41363992](https://our.internmc.facebook.com/intern/diff/D41363992)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89143
Approved by: https://github.com/albanD
2022-11-17 13:55:06 +00:00
PyTorch MergeBot
cf6003f046 Revert "Towards unifying symbolic and non symbolic fake tensor (#89038)"
This reverts commit 37d54239c7.

Reverted https://github.com/pytorch/pytorch/pull/89038 on behalf of https://github.com/ezyang due to executorch segfaults
2022-11-16 16:52:47 +00:00
Edward Z. Yang
37d54239c7 Towards unifying symbolic and non symbolic fake tensor (#89038)
Fake tensor behaves pretty differently depending on whether you have
symbolic shapes or not.  This leads to bugs; for example, we
weren't getting correct convolution_backward strides because we
bypassed the correct stride logic in fake tensor on symbolic
shapes.

This PR attempts to unify the two codepaths.  I don't manage to
unify everything, but I get most of it.  The algorithm is delicate
and I'm still hosing down test failures.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89038
Approved by: https://github.com/anjali411
2022-11-16 14:02:43 +00:00
Richard Zou
3bc327993f PyDispatcher integration with functorch (#88785)
This PR teaches PyDispatcher and PyOperator about functorch transforms.
It is important that PyDispatcher/PyOperator dispatch with functorch
transforms, because this is our plan for higher-order operators
(operators that accept functions as arguments). Examples of these
include:
- functorch transforms over the existing cond operator (control flow)
- autograd.Function support for functorch (which I am working towards),
- AOTDispatcher (should be a higher order operator)

Concretely, the problem with teaching PyDispatcher/PyOperator about
functorch is that the stack-based dispatching logic (DynamicLayerStack)
is hidden inside the fallbacks for two dispatch keys
(DynamicLayer{Front, Back}). PyDispatcher doesn't know about C++ boxed
fallbacks; our plan on record for that is to reimplement
all of them in Python (but we can call helper functions in C++ to make our
lives easier).

Instead of exposing all of what DynamicLayer{Front, Back} do to python,
this PR takes the approach of re-implementing part of the stack-based
dispatching in Python. The motivation is that this is more sane and
follows what the "ideal" implementation of functorch would have been:
- each transform should be a "mode"
- there should be no TLS dispatch key set hackery. functorch needs to do
this hackery today to re-use VariableType implementations.

This PR:
- exposes the DynamicLayerStack to Python
- The DynamicLayerStack is a stack of Interpreters.
These get exposed to Python as well.
- Interpreters can run operations (Interpreter.process) or lower them to
the next interpreter in the stack (Interpreter.lower); see the sketch after
this list
- To use a PyOperator with functorch transforms, a developer needs to
register a rule for each transform (vmap, grad, jvp, ...).
- The PyOperator API is NOT user-facing. Things like autograd.Function
support for functorch will end up going through the autograd.Function
API.
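
A pure-Python sketch of the Interpreter.process / Interpreter.lower dispatch described above (illustrative only; this is not the functorch implementation and the rule bodies are stubs):

```python
class Interpreter:
    def __init__(self, name):
        self.name = name

    def process(self, op, args, rest):
        # A real transform (vmap, grad, jvp, ...) would implement its
        # rule here; this sketch just logs and lowers.
        print(f"{self.name}: processing {op.__name__}")
        return self.lower(op, args, rest)

    def lower(self, op, args, rest):
        if rest:
            nxt, remaining = rest[-1], rest[:-1]
            return nxt.process(op, args, remaining)
        return op(*args)            # bottom of the stack: run the op

def dispatch(stack, op, args):
    top, rest = stack[-1], stack[:-1]
    return top.process(op, args, rest)

def add(x, y):
    return x + y

dynamic_layer_stack = [Interpreter("grad"), Interpreter("vmap")]  # vmap on top
print(dispatch(dynamic_layer_stack, add, (1, 2)))   # vmap, then grad, then 3
```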

Question for reviewers:
- Does this design make sense?
- I'm trying to split up the "functorch support for autograd.Function"
work into logical pieces. Would it be better if I didn't? (the full
thing is a bit long - 1000-2000 LOC).

Test Plan:
- new tests that construct PyOperator and compose them with functorch
transforms
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88785
Approved by: https://github.com/samdow, https://github.com/soulitzer
2022-11-16 00:46:59 +00:00
Sherlock Huang
133e61af7a OpOverload is_view (#88722)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88722
Approved by: https://github.com/ezyang
2022-11-09 19:03:12 +00:00
Edward Z. Yang
53eac1d482 Revert "Revert "Put Python Dispatcher cache in dict, clear it on new registrations. (#88329)"" (#88489)
The bug was that I was accidentally caching at the wrong key name, so
we were never actually hitting the cache.  I've renamed the resolved
key to final_key to avoid shadowing in this way.
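
A self-contained sketch of the shadowing bug and the fix (names are made up; this is not the real dispatcher code):

```python
cache = {}

def resolve_key(op, key):
    return "CompositeExplicitAutograd"      # pretend alias resolution

def find_kernel(op, key):
    return f"<kernel for {op} at {key}>"

def dispatch_buggy(op, key):
    if (op, key) in cache:
        return cache[(op, key)]
    key = resolve_key(op, key)               # bug: shadows the lookup key
    kernel = find_kernel(op, key)
    cache[(op, key)] = kernel                # cached under the resolved key
    return kernel

def dispatch_fixed(op, key):
    if (op, key) in cache:
        return cache[(op, key)]
    final_key = resolve_key(op, key)         # fix: no shadowing
    kernel = find_kernel(op, final_key)
    cache[(op, key)] = kernel                # cached under the lookup key
    return kernel

dispatch_buggy("aten::add", "CPU")
print(("aten::add", "CPU") in cache)         # False: cache is never hit
dispatch_fixed("aten::add", "CPU")
print(("aten::add", "CPU") in cache)         # True
```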

This reverts commit 410ce96a23.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88489
Approved by: https://github.com/albanD
2022-11-04 19:23:04 +00:00
PyTorch MergeBot
410ce96a23 Revert "Put Python Dispatcher cache in dict, clear it on new registrations. (#88329)"
This reverts commit 86c7cd287c.

Reverted https://github.com/pytorch/pytorch/pull/88329 on behalf of https://github.com/clee2000 due to test_decomp takes an extra 2 hours in some jobs, windows takes so long it times out
2022-11-03 21:57:19 +00:00
Edward Z. Yang
86c7cd287c Put Python Dispatcher cache in dict, clear it on new registrations. (#88329)
The motivation is that I am going to add the ability to temporarily
install entries to the python dispatcher, and to do that, I need
an easier way to clear the cache.  Putting the cache in a dict
centralizes cache clearing in one place.  I then add some easy
cache clearing.
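
A tiny sketch of the point about centralizing cache invalidation (illustrative; not the actual registration code):

```python
dispatch_cache = {}                     # (op, key) -> resolved handler

def register_kernel(table, op, key, kernel):
    table.setdefault(op, {})[key] = kernel
    # Any previously cached resolution may now be stale, and because the
    # cache is one dict, invalidating it is a one-liner.
    dispatch_cache.clear()
```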

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88329
Approved by: https://github.com/albanD
2022-11-03 12:53:51 +00:00
Edward Z. Yang
97d3b200ca Unconditionally enable python dispatcher in AOTAutograd (#88365)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88365
Approved by: https://github.com/Chillee
2022-11-03 12:52:19 +00:00
Sherlock Huang
eb99c1efce Prefer python meta function over c++ meta function (#87426)
This is a policy update for meta registration. **We now prefer python meta implementation over C++ meta function.**  This is a flip of the previous policy, where we preferred the C++ meta function over the python meta function if both existed.

Here's the meta registration process:
1. register_meta and register_decomposition will place the python meta/decomp functions into the `global_decomp_table`.  However, they will NOT register them into dispatcher.
2. After global_decomp_table is populated, we will compile an `active_meta_table`. For a given op, we pick the most specific decomp function from `global_decomp_table` in the preference order of Meta > PostAutograd > PreAutograd.
3. We will unconditionally register all of them into the python dispatcher, and register them into the C++ dispatcher unless it is one of the following 3 cases:
- 1. the op is a CompositeImplicitAutograd, and should rely on decomposed op's meta
- 2. the op is a view op, as the MetaTensor doesn't support aliased storage
- 3. the op is in the blocklist (due to UT failures, and we will burn down this list op by op)
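
A hedged sketch of the preference-order lookup in step 2 above (illustrative; the real decomposition tables are keyed differently):

```python
PREFERENCE = ["Meta", "PostAutograd", "PreAutograd"]

def pick_active_entry(global_decomp_table, op):
    # Most specific wins: Meta > PostAutograd > PreAutograd.
    for kind in PREFERENCE:
        entry = global_decomp_table.get((op, kind))
        if entry is not None:
            return entry
    return None

table = {
    ("aten::foo", "PostAutograd"): "post_autograd_meta",
    ("aten::foo", "PreAutograd"): "pre_autograd_meta",
}
print(pick_active_entry(table, "aten::foo"))   # post_autograd_meta
```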

Over the long run, we wish to implement all meta functions in python. With this PR, 321 op_overloads will have their cpp meta overridden by a python meta. There are still 400 op_overloads using cpp meta. The exact list can be found here https://gist.github.com/SherlockNoMad/d20bb736178df8eebd3b054c8bb7cdc5

cc @ngimel @jansel @lezcano @fdrocha @mlazos @soumith @voznesenskym @yanboliang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87426
Approved by: https://github.com/ezyang, https://github.com/jansel
2022-10-25 16:49:02 +00:00
samdow
18d8c548f4 [Modes] remove enable and rewrite mode stack (squashed) (#84774)
Based on @ezyang's suggestion, the mode stack now has "one true mode" which is the _only_ mode that can ever be active at the C++ level. That mode's torch dispatch is just to take the top mode in the stack, reenable itself (if we aren't at the end of the mode stack), and run the top mode's torch_{dispatch|function}.

This maintains that in the middle of a mode's torch dispatch, the mode itself will not be active. It changes the function the user has to call to see what the current mode is (it no longer queries the C++; it's python only) but allows the user to also see the entire mode stack easily.

Removes `enable_torch_dispatch_mode` and `.restore()` since neither makes sense in this new setup.

### Background
Why do we want this? Well, a pretty common pattern that was coming up was that users had to do something like

```python
## PRE-PR UX
def f(mode):
  with mode.restore():  # user needs to understand this restore thing?
    ...

with Mode() as m:
  pass
f(m)
```

Many users were getting errors from forgetting to call `.restore` or from forgetting to add the (tbh weird) "mode instantiation" step where they use the mode as a context manager with an empty body. Really, they wanted to treat modes like context managers and just write
```python
## FROM FEEDBACK, USER DESIRED CODE. POSSIBLE POST-PR
def f(mode):
  with mode:
    ...
f(Mode())
```

### Technical Details
With the old mode stack, we basically had a linked list, so the mode itself could only be used once and had a fixed parent. In this new design, the mode stack is just a python list that we're pushing to and popping from. There's only one mode that's ever active at the C++ level, and it runs the next mode in the Python list. The modes don't have state on them anymore.
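
A pure-Python sketch of that idea: the stack is a plain list, and a mode is popped before its handler runs, so it is never active during its own dispatch (illustrative only; this is not the real torch mode machinery):

```python
_mode_stack = []

class SketchMode:
    def handler(self, op, args):
        # A real mode's __torch_dispatch__ would go here; redispatching
        # from inside sees the rest of the stack, not this mode.
        return run_with_modes(op, args)

    def __enter__(self):
        _mode_stack.append(self)
        return self

    def __exit__(self, *exc):
        _mode_stack.pop()

def run_with_modes(op, args):
    if not _mode_stack:
        return op(*args)
    top = _mode_stack.pop()          # the mode is inactive while it runs
    try:
        return top.handler(op, args)
    finally:
        _mode_stack.append(top)

with SketchMode(), SketchMode():
    print(run_with_modes(lambda x: x + 1, (41,)))   # 42, through both modes
```
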
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84774
Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-09-27 01:04:35 +00:00
Nikolay Korovaiko
b4f9b68225 should_check_strides (#85416)
This PR ports `should_check_strides` checks from `origin/symbolic-shapes` to `master` as part of our dynamic shapes landing effort.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85416
Approved by: https://github.com/ezyang
2022-09-23 04:55:50 +00:00
Edward Z. Yang
490727a35f New calling convention for Python dispatcher (#85133)
Instead of calling into the Python dispatcher for EVERY dispatcher
call, we now have a two step process.  First, we
getattr(op: OpOverload, dispatch_key) to "load" the handler for the
function.  This can either be a conventional function (in which
case we will call it, in the same way the old Python dispatcher
worked), or it can be a DispatchKey, in which case we will directly
call that DispatchKey in C++, bypassing marshalling between Python
and C++ entirely.  OpOverload.__getattr__ is carefully written so
that it will cache the handler after the first access.
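
A hedged sketch of the two-step convention, i.e. "load" the per-key handler once via attribute access and cache it on the instance (illustrative; not the real OpOverload.__getattr__):

```python
class OpOverloadSketch:
    def __init__(self, python_handlers):
        self._python_handlers = python_handlers   # dispatch key -> callable

    def __getattr__(self, dispatch_key):
        # Either a Python handler, or (as a stand-in for "call straight
        # into C++") the dispatch key name itself.
        handler = self._python_handlers.get(dispatch_key, dispatch_key)
        # Cache on the instance so later lookups never reach __getattr__.
        setattr(self, dispatch_key, handler)
        return handler

op = OpOverloadSketch({"CPU": lambda x: x + 1})
handler = op.CPU          # first access: computed, then cached
print(handler(41))        # 42
print(op.CPU is handler)  # True: subsequent accesses hit the cache
```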

A further optimization would be to define __slots__ on OpOverload,
and ensure that the DispatchKey strings are interned.

The resulting Python dispatcher is less flexible: after the first
lookup, the handler is cached and we won't recompute it.  Furthermore,
by default, dispatches will not go into Python, and so you won't
get stack frames for the Python dispatcher by default.  But we get
a huge performance improvement: on the following microbenchmark
we go from 2.5s to 1.9s.

```
import time
import torch
from functorch import make_fx

def f(x):
    for i in range(1000):
        x = x * x
    return x

begin = time.time()
res = make_fx(f, tracing_mode="symbolic")(torch.randn(10, 20))
print(time.time()-begin)
```

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85133
Approved by: https://github.com/wconstab
2022-09-16 20:38:21 +00:00
Edward Z. Yang
1275e2df1f Remove getattr magic method from OpOverload (#85090)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85090
Approved by: https://github.com/wconstab
2022-09-16 00:28:50 +00:00
Michael Voznesensky
8ca1839d32 Python Dispatcher integration with C++ dispatcher (#85050)
#84826 but without ghstack
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85050
Approved by: https://github.com/malfet
2022-09-15 00:43:36 +00:00
PyTorch MergeBot
706b990306 Revert "Python Dispatcher integration with C++ dispatcher (#84826)"
This reverts commit 35f6a69191.

Reverted https://github.com/pytorch/pytorch/pull/84826 on behalf of https://github.com/malfet due to Broke dynamo, see 35f6a69191
2022-09-14 14:07:58 +00:00
Michael Voznesensky
35f6a69191 Python Dispatcher integration with C++ dispatcher (#84826)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

From @ezyang's original PR:

There are a number of situations where we have non-backend kernels (e.g., CompositeImplicitAutograd, batching rules) which we would like to port to Python, but we have no way to integrate these ports with the overall system while otherwise using preexisting C++ registrations. This PR changes that by introducing a Python dispatcher (which can have its own kernels directly in Python), which can interpose over ordinary C++ dispatch. The ingredients:

- We introduce a new PythonDispatcher dispatch key that has the same tenor as FuncTorchDynamicLayerFrontMode: it works by getting triggered before every other dispatch key in the dispatch key set, and shunting to a Python implementation.
- The Python dispatcher is a per-interpreter global object that is enabled/disabled via the guard EnablePythonDispatcher/DisablePythonDispatcher. We don't make it compositional, as I have no idea what a compositional version of this feature would look like. Because it is global, we don't need to memory manage it, and so I use a simpler SafePyHandle (newly added) to control access to this pointer from non-Python C++. Like __torch_dispatch__, we use PyInterpreter to get to the Python interpreter to handle the dispatch.
- I need to reimplement dispatch table computation logic in Python. To do this, I expose a lot more helper functions for doing computations on alias dispatch keys and similar. I also improve the pybind11 handling for DispatchKey so that you can pass either the pybind11-bound enum or a string; this simplifies our binding code. See https://github.com/pybind/pybind11/issues/483#issuecomment-1237418106 for how this works; the technique is generally useful.
- I need to be able to call backend fallbacks. I do this by permitting you to call at a dispatch key which doesn't have a kernel for the operator; if the kernel doesn't exist, we check the backend fallback table instead.
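
A pure-Python sketch of the interposition idea, i.e. consult a table of Python kernels first and fall back to the C++ path otherwise (illustrative only; not the actual PythonDispatcher):

```python
python_kernels = {}     # (op, dispatch_key) -> Python callable

def cpp_dispatch(op, key, args):
    # Stand-in for calling the preexisting C++ registration.
    return f"ran {op} at {key} in C++ with args {args}"

def python_dispatcher(op, key, args):
    kernel = python_kernels.get((op, key))
    if kernel is not None:
        return kernel(*args)          # Python kernel interposes
    return cpp_dispatch(op, key, args)

python_kernels[("aten::add", "CompositeImplicitAutograd")] = lambda x, y: x + y
print(python_dispatcher("aten::add", "CompositeImplicitAutograd", (1, 2)))  # 3
print(python_dispatcher("aten::mul", "CPU", (1, 2)))   # falls back to "C++"
```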

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84826
Approved by: https://github.com/ezyang
2022-09-14 06:57:19 +00:00
Edward Z. Yang
964fde7d7c Raise AttributeError for __origin__ to avoid C++ exception raise (#84880)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84880
Approved by: https://github.com/wconstab
2022-09-12 23:37:53 +00:00
Edward Z. Yang
988bd0173c Add OpOverload.decompose API (#83075)
This allows you to directly call into the CompositeImplicitAutograd
implementation of an operator, *without* changing any aspects of the
dispatcher state.  In particular, you can use this to recursively call
into a decomposition, dispatching back to your tensor subclass/mode
as desired.
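
A hedged usage sketch (it assumes aten::matmul keeps a CompositeImplicitAutograd implementation and that `decompose` returns NotImplemented when an op has none; treat those details as assumptions):

```python
import torch

x = torch.randn(2, 3)
y = torch.randn(3, 4)

# Call directly into the CompositeImplicitAutograd decomposition of matmul,
# without changing any dispatcher state.
out = torch.ops.aten.matmul.default.decompose(x, y)
if out is NotImplemented:
    out = x @ y      # assumed fallback when no composite kernel exists
print(out.shape)     # torch.Size([2, 4])
```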

Hypothetically, we should also make these available in the
decompositions dictionary, but I'm leaving this as future work as
enumerating these decompositions is annoying (as operators are lazily
registered.)

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83075
Approved by: https://github.com/albanD
2022-08-09 18:53:19 +00:00
Sherlock Huang
3d5e49e91d Better error message for torch.library (#82904)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82904
Approved by: https://github.com/davidberard98
2022-08-05 23:33:53 +00:00
Huy Do
12cb26509a Apply ufmt to torch internal (#81643)
This is a big bang PR; merge conflicts are probably expected and will be addressed at merge.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81643
Approved by: https://github.com/ezyang
2022-07-22 02:19:50 +00:00
Edward Wang (EcoF)
ba4f780fde fix submodule imports by importing functions directly (#79368)
Summary:
fixes two sporadic issues from missing attributes:

- breaking circular imports
- submodule not being imported explicitly

Test Plan: sandcastle

Reviewed By: ehhuang

Differential Revision: D37071652

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79368
Approved by: https://github.com/ananthsub
2022-06-22 08:01:23 +00:00
anjali411
38350acf8f Autogen Tags enum, and allow specifying tags while defining an op
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79322

Approved by: https://github.com/albanD
2022-06-11 00:29:32 +00:00
PyTorch MergeBot
954522a485 Revert "Autogen Tags enum, and allow specifying tags while defining an op"
This reverts commit 9476a78f37.

Reverted https://github.com/pytorch/pytorch/pull/77313 on behalf of https://github.com/malfet due to Broke OSS buck builds, see 9476a78f37
2022-06-03 01:53:53 +00:00
anjali411
9476a78f37 Autogen Tags enum, and allow specifying tags while defining an op
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77313

Approved by: https://github.com/ezyang, https://github.com/albanD
2022-06-03 01:13:44 +00:00
Eric Huang (AI Platform)
0d8c9406de [S271837] alias torch._utils_internal (#78597)
Summary:
This is an attempt to fix the following error:

```
...

  File "/mnt/xarfuse/uid-202090/bfc19816-seed-6bb89c2f-e1c0-4c48-8376-657c47aee7ea-ns-4026533122/torchvision/__init__.py", line 5, in <module>
    from torchvision import datasets
  File "/mnt/xarfuse/uid-202090/bfc19816-seed-6bb89c2f-e1c0-4c48-8376-657c47aee7ea-ns-4026533122/torchvision/datasets/__init__.py", line 1, in <module>
    from ._optical_flow import KittiFlow, Sintel, FlyingChairs, FlyingThings3D, HD1K
  File "/mnt/xarfuse/uid-202090/bfc19816-seed-6bb89c2f-e1c0-4c48-8376-657c47aee7ea-ns-4026533122/torchvision/datasets/_optical_flow.py", line 12, in <module>
    from ..io.image import _read_png_16
  File "/mnt/xarfuse/uid-202090/bfc19816-seed-6bb89c2f-e1c0-4c48-8376-657c47aee7ea-ns-4026533122/torchvision/io/__init__.py", line 11, in <module>
    from ._video_opt import (
  File "/mnt/xarfuse/uid-202090/bfc19816-seed-6bb89c2f-e1c0-4c48-8376-657c47aee7ea-ns-4026533122/torchvision/io/_video_opt.py", line 8, in <module>
    from ..extension import _load_library
  File "/mnt/xarfuse/uid-202090/bfc19816-seed-6bb89c2f-e1c0-4c48-8376-657c47aee7ea-ns-4026533122/torchvision/extension.py", line 20, in <module>
    torch.ops.load_library(lib_path)
  File "/mnt/xarfuse/uid-202090/bfc19816-seed-6bb89c2f-e1c0-4c48-8376-657c47aee7ea-ns-4026533122/torch/_ops.py", line 250, in load_library
    path = torch._utils_internal.resolve_library_path(path)
AttributeError: module 'torch' has no attribute '_utils_internal'
```

I don't completely understand the import mechanism, but I tried this in a notebook:

```
# module 'torch' has no attribute '_torch_docs'
import torch._torch_docs
torch._torch_docs.common_args

# works
import torch._torch_docs as t
t.common_args
```

Figured I'd try it to see if it works, because these files do have an `import torch._utils_internal` at the top, which didn't fail.

Test Plan: sc

Differential Revision: D36795248

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78597
Approved by: https://github.com/ngimel
2022-06-01 05:21:52 +00:00
Edward Z. Yang
3a6da16a5a Return all overloads for an operator in _jit_get_operation
This allows us to provide an OpOverloadPacket.overloads method that
lists all of the overloads.
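
A hedged usage sketch (the exact overload names returned are illustrative):

```python
import torch

packet = torch.ops.aten.add               # an OpOverloadPacket
names = packet.overloads()                # e.g. ['Tensor', 'Scalar', 'out', ...]
print(names)
print(getattr(packet, names[0]))          # access one OpOverload by name
```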

This isn't tested; will be exercised in the next PR.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76814

Approved by: https://github.com/mruberry
2022-05-04 23:49:47 +00:00
Edward Z. Yang
a3f10ec281 Move functorch decompositions to PyTorch
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76311

Approved by: https://github.com/Chillee
2022-04-30 16:47:53 +00:00
anjali411
086645ad77 Update __torch_dispatch__ to return op overload instead of the opoverload packet function (#72673)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72673

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D34627164

Pulled By: anjali411

fbshipit-source-id: 3cb6406a392d530bf9da36b4d8e0a62b30e6497e
(cherry picked from commit 65b85a0a67df4d0f16ac8964e2b685d478a610fb)
2022-03-07 22:38:42 +00:00
anjali411
50770b9e19 Add torch.ops per overload API (#72206)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72206

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33955635

Pulled By: anjali411

fbshipit-source-id: 4fbd0c0c4d032bcbd9d9f362b4b6f84eec9ad047
(cherry picked from commit 85076245c9)
2022-02-11 17:19:06 +00:00