- Adds support for custom ops backed by C++ custom autograd functions, e.g. fbgemm
- Include files more granularly to avoid namespace pollution and circular imports
Limitations:
- Requires the user to audit their code and opt in their custom autograd::Function via autograd::Function::is_traceable, possibly with an additional compiled_args + apply_with_saved implementation. This was the only way I could think of to ensure soundness.
- Will throw if we can't hash the saved_data, i.e. for any type not implemented in at::IValue::hash other than list and dict: b0cfa96e82/aten/src/ATen/core/ivalue.cpp (L364)
- Can technically silently fail if both the typeid hash and the typeid string name of the custom autograd::Function collide at the same time, and an identical autograd graph containing a different custom autograd::Function with an identical implementation is called. This case seems extremely unlikely, and the only alternative to hashing I can think of is compiling with reflection.
- Tensors not saved via save_variables are not lifted; they are specialized on the TensorImpl*'s hash (treated as a memory address). If needed, we can lift them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120681
Approved by: https://github.com/jansel
TestCustomOp's tests use helper attributes and functions from a util parent class. To support arbitrary test classes, we need to refactor the current approach: instead of allowlisting certain methods, we can copy the whole class and only overwrite the "test_.*" methods (see the sketch below).
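A minimal sketch of the class-copying idea, under my own assumptions (`wrap_for_opcheck` is a hypothetical wrapper standing in for the real test-generation logic):
```python
# Copy a test class wholesale, overwriting only the "test_*" methods,
# so helper attributes inherited from the util parent class still work.
def copy_test_class(cls, wrap_for_opcheck):
    attrs = dict(cls.__dict__)
    for name, attr in cls.__dict__.items():
        if name.startswith("test_") and callable(attr):
            attrs[name] = wrap_for_opcheck(attr)
    return type(cls.__name__, cls.__bases__, attrs)
```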
Compiled autograd fails on ~10 of the 90 newly added tests. test_autograd_function_backed_op is the example we discussed in the PT-2D meeting about requiring C++ autograd::Function support. I'm addressing this in #120732
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120679
Approved by: https://github.com/jansel, https://github.com/zou3519
Summary:
We've made the following changes:
- The new way to use the API is `m.impl_abstract_pystub(module, context)`.
Every subsequent m.def of an op inside the TORCH_LIBRARY block gives
the op the `impl_abstract_pystub`.
- Added a mechanism to determine if an operator was defined in Python or C++.
Library.define in Python appends the op to a global set, which is analogous
to what we do for tracking Library.impl.
- If someone does `torch.library.impl_abstract` in Python for an operator, then
we require that it has an `impl_abstract_pystub` specified and we also check
that the module in the `impl_abstract_pystub` is the same as the module where
the call to `torch.library.impl_abstract` exists.
- Unfortunately we can't check the "context" (which is the buck target on
buck-based systems) because buck sits above us.
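As a hedged illustration of the module check (all names below are hypothetical): if the pystub names `mylib.abstract_impls` as the module, the `torch.library.impl_abstract` call must live in that module:
```python
# mylib/abstract_impls.py -- the module named in impl_abstract_pystub.
import torch

# Registering this from any other module would fail the module check.
@torch.library.impl_abstract("mylib::my_op")
def my_op_abstract(x):
    return torch.empty_like(x)
```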
bypass-github-export-checks
Test Plan: - existing tests
Differential Revision: D51080493
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113182
Approved by: https://github.com/ezyang
Summary:
We've made the following changes:
- The new way to use the API is `m.impl_abstract_pystub(module, context)`.
Every subsequent m.def of an op inside the TORCH_LIBRARY block gives
the op the `impl_abstract_pystub`.
- Added a mechanism to determine if an operator was defined in Python or C++.
Library.define in Python appends the op to a global set, which is analogous
to what we do for tracking Library.impl.
- If someone does `torch.library.impl_abstract` in Python for an operator, then
we require that it has an `impl_abstract_pystub` specified and we also check
that the module in the `impl_abstract_pystub` is the same as the module where
the call to `torch.library.impl_abstract` exists.
- Unfortunately we can't check the "context" (which is the buck target on
buck-based systems) because buck sits above us.
Test Plan: - existing tests
Differential Revision: D50972148
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112851
Approved by: https://github.com/ezyang
Summary:
If there are xfails in the failures_dict and the operator has the
pt2_compliant_tag, then we raise an error. These generated tests are separate
from those in the failures dict because we don't actually need any sample
inputs to check this.
Test Plan: - New tests
Differential Revision: D50936201
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112759
Approved by: https://github.com/ezyang
Unlike the previous torch.library.define, the schema no longer takes a
name (the name is part of the qualname). We separated out the qualname
from the schema in the new APIs so that they're all consistent with each
other (they all accept the qualname separately).
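A minimal sketch of the new form (the `mylib::sin` name is hypothetical):
```python
import torch

# The qualname is passed separately; the schema no longer embeds the name.
torch.library.define("mylib::sin", "(Tensor x) -> Tensor")
```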
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111915
Approved by: https://github.com/suo, https://github.com/ezyang
ghstack dependencies: #111912
torch.library.impl now accepts a device string (e.g. "cpu", "cuda"). It
still accepts DispatchKey strings, but we no longer document this, because
using arbitrary DispatchKeys is for power users.
We map the device string to a DispatchKey and then register the impl for
that DispatchKey. A user may also specify multiple device strings at once
or specify `types="default"` to get a CompositeExplicitAutograd registration.
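A short usage sketch (the op name is hypothetical, and it assumes the op was defined beforehand):
```python
import torch

torch.library.define("mylib::add_one", "(Tensor x) -> Tensor")

# "cpu" is mapped to the corresponding DispatchKey under the hood.
@torch.library.impl("mylib::add_one", "cpu")
def add_one_cpu(x):
    return x + 1
```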
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111659
Approved by: https://github.com/soulitzer
ghstack dependencies: #111380
We add a new overload to torch.library.impl that accepts an optional
Library arg. If provided, the lifetime of the registration is tied to
the Library arg; otherwise, it lives forever.
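A minimal sketch (the `mylib` namespace is hypothetical):
```python
import torch

lib = torch.library.Library("mylib", "FRAGMENT")
lib.define("my_op(Tensor x) -> Tensor")

# Passing lib= ties the registration's lifetime to `lib`; omitting it
# makes the registration live forever.
@torch.library.impl("mylib::my_op", "cpu", lib=lib)
def my_op_cpu(x):
    return x.clone()
```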
Test Plan:
- existing and new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111308
Approved by: https://github.com/soulitzer
ghstack dependencies: #111307
Summary:
Make it easier to add `generate_opcheck_tests` by adding defaults for
the failures_dict location, the additional decorators, and the test
utils.
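A hedged sketch of what the simplified call might look like (the import path and argument names are my assumptions based on the description above):
```python
from torch.testing._internal.optests import generate_opcheck_tests  # assumed path

# With defaults for the failures_dict location, additional decorators,
# and test utils, only the test class and op namespaces are needed.
generate_opcheck_tests(TestMyOps, namespaces=["mylib"])
```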
Test Plan:
Existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110977
Approved by: https://github.com/williamwen42
ghstack dependencies: #110951
This PR adds the following helper functions for generated opcheck tests:
- dontGenerateOpCheckTests is a decorator that skips generation of the
opcheck tests for the decorated function
- is_inside_opcheck_mode lets us query if we are in a generated test.
Useful for fast debugging out-of-tree without needing to update
PyTorch.
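A hedged usage sketch (the import path and the decorator's argument are assumptions and may differ from the actual helpers):
```python
from torch.testing._internal.common_utils import TestCase
from torch.testing._internal.optests import (  # assumed import path
    dontGenerateOpCheckTests,
    is_inside_opcheck_mode,
)

class TestMyOps(TestCase):
    @dontGenerateOpCheckTests(reason="helper logic, not an op test")
    def test_helper_logic(self):
        ...

    def test_my_op(self):
        if is_inside_opcheck_mode():
            # Only reached while running under a generated opcheck test.
            print("inside a generated opcheck test")
        ...
```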
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110951
Approved by: https://github.com/williamwen42
This PR allows us to use the same failures_dict for multiple test
classes. This is helpful if you have a bunch of small TestCase(es) and
want to centralize all the failures dicts into one big one.
Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110164
Approved by: https://github.com/williamwen42
Changelog:
- torch.library.impl_abstract optionally accepts a torch.library.Library
object. If passed in, then the lifetime of the registration is tied to
the Library object.
- We've also changed torch.library.impl_abstract to work on all
operators, including overloads.
- We refactored the `torch._custom_ops.*` and `torch._custom_op.*`
impl_abstract APIs and put them under torch._library. This is the
final resting place for them. I will follow up with deleting
all the `torch._custom_ops.*` stuff later.
- There is a new "SimpleOperatorRegistry" where we actually collect the
abstract_impl. We will expand this to also hold the other
torch._custom_ops.* APIs when we move those to torch.library.
NB: Previously we had designed `impl_abstract` assuming a very high-level,
Python-only custom op API. We've since revisited that; now, impl_abstract
works for all custom ops, whether defined in Python or C++ and regardless
of schema. The new refactored design reflects this better.
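A minimal sketch of the Library-scoped variant (namespace and op are hypothetical):
```python
import torch

lib = torch.library.Library("mylib", "FRAGMENT")
lib.define("my_op(Tensor x) -> Tensor")

# The abstract impl's lifetime is tied to `lib` via the lib= argument.
@torch.library.impl_abstract("mylib::my_op", lib=lib)
def my_op_abstract(x):
    # Runs under FakeTensor/Meta: return an output with the right
    # metadata without doing real compute.
    return torch.empty_like(x)
```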
Test Plan:
- existing and new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109912
Approved by: https://github.com/ezyang
We want users to be able to define custom ops in C++ but put the
abstract impl in Python (since it is easier to write them in Python and
the abstract impl better models device semantics and data-dependent
operators).
`m.impl_abstract_pystub(opname, python_module, context)` declares the
abstract_impl of the operator to exist in the given python module.
When the abstract_impl needs to be accessed (either via FakeTensor or
Meta), and it does not exist, the PyTorch Dispatcher will yell
with a descriptive error message.
Some details:
- We construct a new global AbstractImplPyStub mapping in
Dispatcher.cpp. Read/write to this map is protected by the Dispatcher
lock.
- We add a new Meta Tensor fallback kernel. The fallback errors out if there is
no meta kernel, but also offers a nicer error message if we see that there is
a pystub.
- We create a `torch._utils_internal.throw_abstract_impl_not_imported_error`
helper function to throw errors. This way, we can throw different error
messages in OSS PyTorch vs internal PyTorch. To invoke this from C++, we
added a PyInterpreter::throw_abstract_impl_not_imported_error.
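For illustration, a hedged sketch of the Python side (module and op names are hypothetical): assuming the C++ side declared the pystub for `mylib::my_op` as living in `mylib.abstract_impls`, that module provides the impl:
```python
# Hypothetical contents of mylib/abstract_impls.py.
import torch

@torch.library.impl_abstract("mylib::my_op")
def my_op_abstract(x):
    # If this module is never imported, the Meta fallback raises the
    # descriptive "pystub not imported" style error described above.
    return torch.empty_like(x)
```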
Differential Revision: [D49464753](https://our.internmc.facebook.com/intern/diff/D49464753)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109529
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
On failure of a test, we will always print a "repro". This repro isn't
really runnable but gives the user a sense of how to actually reproduce
the test without the test suite, because using the test suite is a bit
convoluted.
If the user passes PYTORCH_OPCHECK_PRINT_BETTER_REPRO, we will print a
fuller repro that saves the exact problematic test inputs to disk and
reads them back out.
Test Plan:
- expecttests on the generate_repro helper function
- tried this out locally.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109640
Approved by: https://github.com/bdhirsh, https://github.com/soulitzer
ghstack dependencies: #109637, #109638, #109639
**Motivation:**
We try to make torch.cond use torch.compile automatically so that we can error out when there are side effects in the branches and correctly handle closures.
Before this PR, we only issued a warning for the following code unless the config raise_on_backend_change was turned on (turning it on gives an error instead):
```python
def foo(x):
    ...

# Inside torch.cond, we'd like to do something like:
torch.compile(foo, backend="eager", fullgraph=True)(...)

# Users may then call torch.compile somewhere else.
# Dynamo will use the cached code of foo for the "eager" backend,
# but we expect dynamo to recompile with the "inductor" backend.
torch.compile(foo, backend="inductor")(...)
```
This PR adds a BACKEND_MATCH guard. Effectively, it implements a per-backend cache. In the above example, the cached code for "eager" won't work for "inductor" due to guard check failures and the second torch.compile will do a re-compilation. In the future, it might be useful to have something like a configuration guard that guards against dynamo configuration changes across different compiles (e.g. compile a function with fullgraph=False then compile it again with fullgraph=True).
**Implementation:**
1. We add a guarded_backend_cache and check the most_recent_backend against the backend associated with the cached code. We also remove the raise_on_backend_change flag.
Note: the newly added context manager and guards cause more lines to be printed in the debug log.
**Test Plan:**
Removed the original tests that raise on a backend change, and added a new test checking that the BACKEND_MATCH guard catches backend changes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107337
Approved by: https://github.com/jansel
- Update cross-ref FakeMode test to use ShapeEnv. Dynamic ops can now
return an unbacked SymInt. We always accept this as equal to whatever
the real value was.
- Relax test so it works on all classes, not just unittest.TestCase
- Properly wrap the original method, so things like
pytest.mark.parametrize are carried over
- Support dynamic shapes by default for make_fx `tracing_mode="fake"` without symbolifying everything else
Fixes https://github.com/pytorch/pytorch/issues/108927
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108929
Approved by: https://github.com/zou3519
We changed the failures_dict format from .py to json and added a way to
automatically update the failures dict (the user can set
PYTORCH_OPCHECK_ACCEPT=1 to do so), assuming the tests don't crash in the
process.
Some details:
- We introduced a FailuresDict class that handles save/load and from which one
can query a test status ("xfail", "skip", etc).
- PYTORCH_OPCHECK_ACCEPT=1 does not override everything. In particular: it
doesn't try to update the failures dict for a test marked as "skip", but it
will update it for tests marked as "xfail" or "success".
- PYTORCH_OPCHECK_ACCEPT=1 also does not override the "comment" field, unless
it is flipping an "xfail" into "success".
- I'll update the gdoc linked in the comments with how to actually use
PYTORCH_OPCHECK_ACCEPT=1 internally (it's not trivial).
Note that this isn't multithreading-safe; the current recommendation is to run
the tests sequentially if the user wants to use PYTORCH_OPCHECK_ACCEPT=1.
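Purely as an illustration of the statuses above, a hypothetical failures dict might look like this (shown as a Python literal; the real JSON schema and test-name format may differ):
```python
failures_dict = {
    "mylib::my_op": {
        "TestMyOps.test_faketensor__test_my_op": {
            "comment": "fails under FakeTensor; see tracking issue",
            "status": "xfail",  # the statuses described above: "xfail", "skip", "success"
        },
    },
}
```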
Differential Revision: D49167181
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109110
Approved by: https://github.com/ezyang
**Motivation:**
We try to make torch.cond use torch.compile automatically so that we can error out when there are side effects in the branches and correctly handle closures.
Before this PR, we only issued a warning for the following code unless the config raise_on_backend_change was turned on (turning it on gives an error instead):
```python
def foo(x):
    ...

# Inside torch.cond, we'd like to do something like:
torch.compile(foo, backend="eager", fullgraph=True)(...)

# Users may then call torch.compile somewhere else.
# Dynamo will use the cached code of foo for the "eager" backend,
# but we expect dynamo to recompile with the "inductor" backend.
torch.compile(foo, backend="inductor")(...)
```
This PR adds a BACKEND_MATCH guard. Effectively, it implements a per-backend cache. In the above example, the cached code for "eager" won't work for "inductor" due to guard check failures and the second torch.compile will do a re-compilation. In the future, it might be useful to have something like a configuration guard that guards against dynamo configuration changes across different compiles (e.g. compile a function with fullgraph=False then compile it again with fullgraph=True).
**Implementation:**
1. We add a guarded_backend_cache and check the most_recent_backend against the backend associated with the cached code. We also remove the raise_on_backend_change flag.
2. The newly added context manager and guards add more lines to the debug log, so we raise the upper limit from 50 to 55.
**Test Plan:**
Removed the original tests that raise on a backend change, and added a new test checking that the BACKEND_MATCH guard catches backend changes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107337
Approved by: https://github.com/jansel