Commit Graph

195 Commits

Author SHA1 Message Date
Richard Zou
eaeea62ee4 Make TestPythonRegistration clean up after itself (#102292)
We did this for TestCustomOp, now we are applying the same thing to
TestPythonRegistration.

This PR:
- changes TestPythonRegistration to register new ops under a single
namespace (self.test_ns)
- clean up the namespace by deleting it from torch.ops after each test
is done running.

This avoids a problem where if an op is re-defined, torch.ops.myns.op
crashes because we do some caching. The workaround in many of these
tests have been to just create an op with a different name, but this PR
makes it so that we don't need to do this.

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102292
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2023-06-02 13:36:50 +00:00
Edward Z. Yang
3318a832b3 Tighten FakeTensor reentrancy asserts, add debugging (#102091)
When investigating failures in https://github.com/pytorch/pytorch/pull/100017 I realized that we were reentering FakeTensorMode even though there was already one on the stack. Although we have attempted assert for these cases in the past, e.g., as in https://github.com/pytorch/pytorch/pull/97186 it seems that the existing protections were insufficient.

In this particular case, the reapplication of FakeTensorMode was due to an interaction with NotImplemented multiple dispatch handling. If proxy tensor mode detects an unrecognized tensor type (this includes FakeTensor, if it is not tracked with a proxy), it will return NotImplemented to give this tensor a chance to unpack itself into proxyable operation. However, this is never the right thing for FakeTensor, where no unpacking is possible. However, today, FakeTensor attempts to reapply the FakeTensorMode, resulting in FakeTensorMode being twice on the stack.

This PR does a number of things:

* It adds an assert in `FakeTensorMode.__torch_dispatch__` that you must not already have this mode on the stack, this is ALWAYS an error
* It modifies `FakeTensor.__torch_dispatch__` to return `NotImplemented` if the mode is already active. This prevents us from readding the mode on the stack
* It adds a new logging artifact `not_implemented` which you can use to get debug logs about all of the times a `__torch_dispatch__` handler returned NotImplemented and why it did so. Your subclass has to manually opt into this logging, but I inserted the necessary logs for ProxyTensorMode and FakeTensor(Mode)
* `with fake_mode` now no-ops if the fake mode is already on the stack, which is what users want anyway
* I am BREAKING pre-autograd tracing, because it is currently doing something weird with the original C++ mode stack. Brian is going to follow up with a fix next week.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102091
Approved by: https://github.com/thiagocrepaldi, https://github.com/eellison, https://github.com/wanchaol, https://github.com/bdhirsh
2023-05-24 05:37:51 +00:00
Richard Zou
723f111545 [custom_op] explicit autograd API (#101824)
This PR adds an explicit API for registering a backward formula for a
CustomOp. In the end state, we will likely have this explicit API and a
magic API (which is sugar on top of an explicit API), since different
parties of users prefer different ones.

Concretely, to define a backward formula for a CustomOp:
- a user must provide us a "save for backward" function that accepts
(inputs, output) and returns exactly what they want saved for backward
- a user must provide us a "backward" function that accepts
(ctx, saved, *grads) and returns us the grad_inputs. The grad_inputs
are returned as a dict mapping str to a gradient.
Please see the changes in custom_op_db.py for examples of the API.

There are a number of pieces to this PR and I'm happy to split it if it
helps. They are:
- The actual APIs for specifying the two functions
(impl_save_for_backward, impl_backward)
- The autograd kernel: we take the functions the user give us and
construct an autograd.Function object that we then register to
the Autograd dispatch key
- Indirection for the autograd kernel. We add a layer of indirection so
that one can swap out the autograd kernel. This is necessary because by
default, we register an "autograd not implemented" kernel as the
Autograd implementation but then swap it for the actual kernel when the
user provides it.

Test Plan:
- We apply this API to give backward formulas for things in
custom_op_db. We then hook up custom_op_db to the Autograd OpInfo tests.
- Various tests in test_python_dispatch.py to check error cases.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101824
Approved by: https://github.com/ezyang
2023-05-23 18:31:29 +00:00
Richard Zou
8487105fae [custom_op] Create a new torch._custom_op namespace (#101823)
torch/custom_op.py is getting long, and the autograd pieces are going to
make it even longer. I'm planning on just organizing the files under
a torch/_custom_op folder.

Note that the imports now look a bit crazy (from torch._custom_op.impl
import...) but they will look more OK when we figure out the plan to
make custom_op public (coming later).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101823
Approved by: https://github.com/ezyang, https://github.com/albanD, https://github.com/bdhirsh
2023-05-23 18:31:29 +00:00
Richard Zou
73d1be8e99 [custom_op] Add a test for symints (#101822)
Tests that a custom op annotated with Sequence[int] actually accepts
Sequence[SymInt].
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101822
Approved by: https://github.com/ezyang, https://github.com/albanD, https://github.com/bdhirsh
2023-05-23 18:31:27 +00:00
Richard Zou
4de5ee43bf [torch.library] Change Library.__del__ into weakref.finalize (#101829)
`__del__` is a bit difficult to use, because when it is called, it is
not guaranteed that anything it uses has not been cleaned up.

Ed tells me he got the following exception one day, which is what
prompted this PR.
```
Exception ignored in: <function Library.__del__ at 0x7fa36d211e50>
Traceback (most recent call last):
  File "/data/users/ezyang/a/pytorch/torch/library.py", line 139, in
  __del__
  AttributeError: 'NoneType' object has no attribute 'remove'
```

One solution is to use weakref.finalize, which lets one define a
function to be run when the object is deleted that can hold references
to specific things it needs.

Another solution is to just check if the object is None, but I like the
weakref solution better.

Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101829
Approved by: https://github.com/ezyang
2023-05-22 19:51:08 +00:00
Richard Zou
6bc0f4a4ee [reland][CustomOp] Add Dispatcher error callback (#101452)
Reland of #101015, original stack reverted due to internal test
flakiness.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101452
Approved by: https://github.com/soulitzer
2023-05-16 13:33:31 +00:00
Richard Zou
c8be493dac [reland][custom_op] Change the python type that maps to ListType in schema (#101451)
Reland of #101190. Original stack was reverted due to internal test
flakiness.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101451
Approved by: https://github.com/soulitzer
2023-05-16 13:33:31 +00:00
Richard Zou
4f8cbaa10a [reland] Cleanup custom op library after each custom_op test (#101450)
Reland of #100980. Original PR was reverted due to internal test
flakiness.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101450
Approved by: https://github.com/soulitzer
2023-05-16 13:33:29 +00:00
PyTorch MergeBot
7912b34789 Revert "[CustomOp] Add Dispatcher error callback (#101015)"
This reverts commit c0e5d7e7fe.

Reverted https://github.com/pytorch/pytorch/pull/101015 on behalf of https://github.com/huydhn due to Revert this as the earlier commits in the stack have been reverted ([comment](https://github.com/pytorch/pytorch/pull/101015#issuecomment-1548476583))
2023-05-15 19:49:53 +00:00
PyTorch MergeBot
349a2b3871 Revert "Cleanup custom op library after each custom_op test (#100980)"
This reverts commit d0d8165230.

Reverted https://github.com/pytorch/pytorch/pull/100980 on behalf of https://github.com/jeanschmidt due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/100980#issuecomment-1548336634))
2023-05-15 18:17:42 +00:00
PyTorch MergeBot
b50595702b Revert "[custom_op] Change the python type that maps to ListType in schema (#101190)"
This reverts commit de6470e28e.

Reverted https://github.com/pytorch/pytorch/pull/101190 on behalf of https://github.com/jeanschmidt due to preventing the revert of #100980 ([comment](https://github.com/pytorch/pytorch/pull/101190#issuecomment-1548332644))
2023-05-15 18:15:08 +00:00
Scott Wolchok
a8c32eb78e [PyTorch] add test for numel slow path affecting data_ptr (#100993)
This test would have stopped #98090 -- data_ptr needs to call custom Python numel if it exists, since it could be arbitrary Python.

Differential Revision: [D45701566](https://our.internmc.facebook.com/intern/diff/D45701566/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100993
Approved by: https://github.com/ezyang
2023-05-12 20:33:39 +00:00
Richard Zou
c0e5d7e7fe [CustomOp] Add Dispatcher error callback (#101015)
The PyTorch Dispatcher's "no kernel found for DispatchKey" error message
is a bit long and winded. This PR adds a way to add a custom error
callback and changes the CustomOp API to use the custom error callback
to deliver better error messages.

Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101015
Approved by: https://github.com/ezyang
2023-05-12 13:49:20 +00:00
Richard Zou
de6470e28e [custom_op] Change the python type that maps to ListType in schema (#101190)
Previously, to specify e.g. int[], a user needed to do Tuple[int, ...].
This PR changes it to Sequence[int].

Bikeshedding: we could totally just use List[int] instead. The types
that the user gives us that we use to infer a schema is not entirely
faithful: for example, we convert `int` to SymInt.

I didn't feel strongly between Sequence[int] and List[int] so I went
with the more faithful one, plus Python recommends that you use Sequence
for input arguments (over list or tuple), though we don't subscribe to
that philosophy in general.

Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101190
Approved by: https://github.com/bdhirsh
2023-05-12 13:49:20 +00:00
Richard Zou
d0d8165230 Cleanup custom op library after each custom_op test (#100980)
This PR tells the custom op tests to destroy all custom ops with
specified namespace after each test.

The general problem is that if a test fails, the custom op isn't cleaned
up. We could fix this via try-finally, but using a tearDown method
seemed like a nice O(1) solution.

Test Plan:
- deleted some foo._destroy, verified that the test suite passes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100980
Approved by: https://github.com/soulitzer, https://github.com/bdhirsh
2023-05-12 13:49:18 +00:00
Richard Zou
3ffeab7f80 [custom_op] Make repeated registrations error gracefully (#100979)
Previously the error message went through torch.library. This PR changes
it so that on each custom_op.impl_* call:
- we store a (function, location) tuple
- if a (function, location) tuple exists already, then we raise an
error.

This logic already existed for the abstract impl (the impl for meta and
fake tensors), so this PR just extends it to the others.

Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100979
Approved by: https://github.com/bdhirsh, https://github.com/soulitzer
2023-05-12 13:49:15 +00:00
Aaron Gokaslan
738ba13b35 [BE]: enable PLE error codes in ruff and fix bugs (#101079)
Enables PyLint error codes implemented in ruff. These are un-opinionated static analysis checks on Python code that finds common bugs. After running all the PLE error codes that are implemented in ruff, I fixed the bugs, added a few ignores for malformed Python code that is part of our JIT test script, and finally added a few ignores for a false positive on PLE0605 and submitted an issue upstream to fix in ruff https://github.com/charliermarsh/ruff/issues/4345 .

Common bugs found here include analysis for malformed logging format calls, bad string format calls, invalid escape sequences, and more.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101079
Approved by: https://github.com/malfet
2023-05-11 23:57:25 +00:00
Edward Z. Yang
ce1ad1c143 Add load_storage (#100519)
This adds a new operator debugprims::load_storage which does the unusual thing of loading a tensor from disk (via ContentStoreReader). This will be used in a later PR to implement delta debugging in the minifier, even when the repro is too big to fit into memory. The way it works is that you specify a name of the tensor you want to load, as well as enough metadata to reconstruct the tensor, if the store isn't available. If there is an active content store, we read and return the tensor from that store; otherwise we use `rand_strided` to create it.

I needed some infra improvements to do this:

* `custom_op` now supports factory functions. Factory functions have to be registered specially via `impl_factory`
* I modified `clone_input` to also support dtype conversion, which I use to change the dtype of a loaded tensor if necessary.
* ContentStore needs to work with a device argument, so we torch.load directly to the correct device. This is for fake tensor support.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100519
Approved by: https://github.com/zou3519, https://github.com/anijain2305
2023-05-05 05:25:03 +00:00
Richard Zou
1b84be551a Improved CustomOp API with schema inference (#100127)
This PR changes the CustomOp API. There are now two ways to create a
CustomOp object.

Method 1: with no schema string. We will infer what the schema string is
from your type annotations

```py
@custom_op("customlib::foo")
def foo(x: Tensor) -> Tensor:
    ...
```

Method 2: with a schema string, if the inference doesn't work well.

```py
@custom_op("customlib::foo", "(Tensor x) -> Tensor")
def foo(x):
    ...
```

Some details:
- We support most combinations of {Tensor, Number, int, float, bool} and
{Optional[typ], Tuple[typ, ...]} as inputs. The combinations we support are mostly
from me reading native_functions.yaml.
- We support only Tensor or Tuple of Tensor of fixed size returns.
- A lot of this PR is input validation for both of the above two
methods. For example, when a user provides a manual schema string, then
their function must not have any type annotations and the number of args
and arg names must match the schema.

Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100127
Approved by: https://github.com/ezyang
2023-04-28 16:53:07 +00:00
Richard Zou
7ebb60c9f4 [CustomOp] Fix lifetime semantics (#100114)
This PR makes a CustomOp live forever. The motivation for it living
forever is that:
1. It doesn't matter to a user if it lives forever or not
2. it is a higher-level abstraction over OpOverload, and OpOverload
assumes that OpOverload lives forever.

The only place where it matters that CustomOp lives forever is testing:
I don't want to generate random names for my CustomOp objects. To
resolve the testing problem, This PR adds a CustomOp._destroy() that
clears all the C++ state, including the OpOverloadPacket, that is
associated with the CustomOp object.

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100114
Approved by: https://github.com/ezyang
2023-04-28 16:53:07 +00:00
Richard Zou
e6f9bc500b CustomOp simple abstract implementation registration (#99439)
This PR:
- adds an abstract registration API for CustomOp (CustomOp.impl_abstract)
that is used for both FakeTensor and meta tensors
- deletes CustomOp.impl_meta

The user story behind this API is that it is the one-stop shop for
registering implementations for data-less Tensors, i.e. FakeTensor and
Meta tensor.

The abstract implementation provided by the user:
- gets registered as the FakeTensor implementation AND the meta formula
- can be written like a regular meta formula. If the user decides that
they need something more special (i.e. data-dependent output shape),
then they are able to query a current context object (FakeTensorImplCtx)
that has methods to construct new unbacked symints.

Caveats:
- we really need to make FakeTensor/FakeTensorMode public. Otherwise,
there isn't a way for the user to interactively test that their abstract
implementation is correct without running through large pieces of the
PT2 stack (make_fx or torch.compile).
- We do not memoize the symints produced by
ctx.create_unbacked_symint(). It is possible to do this in the
future, but it is difficult to do soundly and I am not convinced of
the utility outside of the nonzero() usecase mentioned in #95399

Public API:
- More docs will come when we actually expose this API to users by
putting it in a public namespace, unless you folks want it now.
- The APIs mentioned in `__all__` are the ones that are intended to be
public.

Test Plan:
- Updated existing custom_op_db operators
- Added new numpy_nonzero and numpy_nms operations that test operations
that have data-dependendent output shape.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99439
Approved by: https://github.com/ezyang
2023-04-28 13:45:39 +00:00
Luca Wehrstedt
24bf15fe8d Support record_stream in dispatch mode (#99529)
Summary:
Issuing a `t.record_stream(s)` call while a `TorchDispatchMode` is active was throwing because PyTorch was unable to convert a c10::Stream back to a Python object. It's now fixed.

Fixes https://github.com/pytorch/pytorch/issues/94403

Test Plan: Added a unit test

Differential Revision: D45117566

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99529
Approved by: https://github.com/albanD
2023-04-21 07:17:19 +00:00
Richard Zou
44b09bf673 Reland "Simple Custom Operator API, V0 (#98440)" (#99416)
See the original PR (#98440) for the description. It broke internal
builds due to proxy_tensor.py not importing torch._dynamo, which is
being fixed in the previous PR in the stack.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99416
Approved by: https://github.com/soulitzer, https://github.com/bdhirsh
2023-04-18 23:48:33 +00:00
PyTorch MergeBot
f497031df9 Revert "Simple Custom Operator API, V0 (#98440)"
This reverts commit 0157b2d722.

Reverted https://github.com/pytorch/pytorch/pull/98440 on behalf of https://github.com/DanilBaibak due to Break internal build
2023-04-18 13:04:27 +00:00
Richard Zou
0157b2d722 Simple Custom Operator API, V0 (#98440)
This PR introduces CustomOp, a wrapper around a dispatcher operator that allows
users to define custom operators. It adds the skeleton for CustomOp and
some very simple behavior: as of this PR:
- one can create a CustomOp for an operator that does not have inplace or aliasing
- give it CPU/CUDA and Meta implementations
- and trace it into a graph via make_fx.

The design follows
https://docs.google.com/document/d/19Uc5OUCA187q9BZggJb70RT2ZoSTDoG5QQkJkZwd25M/edit
Concretely, we implement the following things mentioned in the doc in this PR:
- Entrypoint 1 (CustomOp.define, creating a new custom operator)
- impl (to define device-specific code) and impl_meta (to define meta
formulas)

The goal for the short term is to get the code to a state where it can be trialed
by the export folks. On top of this PR, the blockers are:
- adding Entrypoint 3 (CustomOp.from_existing)
- adding a way to do data-dependent shape formulas
These will come in future PRs since this one is getting long.

Things that will come in the longer-near-term (before 2.1):
- adding the other entrypoints mentioned in the doc (2 & 3)
- more safety checks and better error messages
- support for views and mutation
- support for defining autograd formulas
- support for functionalization
- making this API public (it's private right now).

Test Plan:
- added a new test case, TestCustomOp. It mostly tests a bunch of error
cases.
- added OpInfos for custom operators and hooked these up to
test_proxy_tensor to test that they work with make_fx. These custom
operators were based off of the ones in the autograd_function_db.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98440
Approved by: https://github.com/ezyang
2023-04-17 12:17:32 +00:00
Richard Zou
d5120ff18a [torch.library] Add ability to create library fragments (#98439)
In C++ we have TORCH_LIBRARY_FRAGMENT. This PR adds the same
functionality to the Python torch.library API.

The motivation for this is: for the simple custom op API, we don't want
users to need to deal with Library objects. One way to hide this from
users is to create library fragments.

Test Plan:
- tests that you can create multiple fragments and def+impl operators on each.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98439
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2023-04-10 18:04:53 +00:00
Richard Zou
618ea6fac3 Fix test_python_dispatch under debug mode (#98609)
The problem for these operators is that they were returning the input
directly as the output. This isn't support and will raise debug asserts.

Test Plan:
- Test locally. The debug build in CI doesn't actually do anything.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98609
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2023-04-10 18:04:53 +00:00
Shunting Zhang
a4b02a15d3 Support registering op returning symint in python (#95240)
Running an operator registered in python returning a symint will result in the following error:
```
RuntimeError: Unable to cast Python instance of type <class 'torch.SymInt'> to C++ type 'long'
```

The interaction of 2 things make the issue being triggered:
- We use boxed kernel here. For boxed kernel, we need convert py::object to IValue in torch/csrc/autograd/python_variable.cpp pushPyOutToStack .
- In the schema parsing code in torch/csrc/jit/frontend/schema_type_parser.cpp SchemaTypeParser::parseFakeAndRealType , if a SymInt is found, we register a Int type instead (not sure why we do this), and register SymInt as the real type.

The result is we would convert an SymInt to int in pushPyOutToStack and cause the issue.

The fix is to use real type when we convert py::object to IValue.

BTW, registering the same op using C++ API does not trigger the issue.
```
TORCH_LIBRARY(clib, m) {
  m.def("sqsum(SymInt a, SymInt b) -> SymInt", [](SymInt a, SymInt b) -> SymInt {
    return a * a + b * b;
  });
}
```
The reason is, the kernel registered in C++ is unboxed kernel and it does not trigger the code path above that converts an py::object to IValue.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95240
Approved by: https://github.com/larryliu0820, https://github.com/ezyang
2023-02-22 04:56:37 +00:00
Aaron Gokaslan
748bac8757 [BE]: Apply pyupgrade yield from and unit test alias upgrades (#94309)
Applies some more harmless pyupgrades. This one gets rid of deprecated aliases in unit_tests and more upgrades yield for loops into yield from generators which are more performance and propagates more information / exceptions from original generator. This is the modern recommended way of forwarding generators.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94309
Approved by: https://github.com/albanD
2023-02-07 20:08:58 +00:00
Edward Z. Yang
434eb16deb Correctly restore pybind11 error_already_set (#93238)
We would handle py::error_already_set correctly from pybind11 bindings,
but not from our regular TH bindings, which meant that anything from
an inner pybind11 function call was getting unconditionally transformed
into a RuntimeError.  Not too many cases where we do this, but
PySymNodeImpl was one of them.

To test this, I need to raise a non-RuntimeError from a function which
is invoked from pybind11 and then propagated to a non-pybind11 call
site.  I introduce GuardOnDataDependentSymNode for expressly this
purpose (this is how I discovered the bug anyway.)

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93238
Approved by: https://github.com/Skylion007, https://github.com/albanD
2023-01-30 16:43:01 +00:00
PyTorch MergeBot
1490dc6421 Revert "[BE] meow (#92174)"
This reverts commit 3debb97084.

Reverted https://github.com/pytorch/pytorch/pull/92174 on behalf of https://github.com/ezyang due to oh yeah i think the print is intentional graph break
2023-01-14 07:32:39 +00:00
Jane (Yuan) Xu
3debb97084 [BE] meow (#92174)
:')
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92174
Approved by: https://github.com/ezyang, https://github.com/Skylion007
2023-01-14 05:36:47 +00:00
PyTorch MergeBot
db466ae057 Revert "[Modes] Add assert that the mode isn't already on the stack (#90770)"
This reverts commit 702838637d.

Reverted https://github.com/pytorch/pytorch/pull/90770 on behalf of https://github.com/DanilBaibak due to Break internal build
2023-01-12 16:44:29 +00:00
samdow
702838637d [Modes] Add assert that the mode isn't already on the stack (#90770)
Redo of #89726 on a clean PR, thanks @voznesenskym for the first draft!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90770
Approved by: https://github.com/ezyang
2023-01-11 15:19:43 +00:00
Edward Z. Yang
66736ff425 Fix bug in OptionalTensorList (#88887)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88887
Approved by: https://github.com/anjali411
2022-11-12 02:19:46 +00:00
samdow
169ec120ef [Modes] refactor modes to only use a stack in cpp (#86458)
Refactors the mode code to only have the C++ mode stack and not the "C++ mode" like we originally had. This also simplifies the mode logic in a number of places
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86458
Approved by: https://github.com/zou3519
2022-10-21 19:18:23 +00:00
Edward Z. Yang
3b6588ab74 Consistent compute numel/contiguous strategy with SymInts (#85858)
Previously, our handling for contiguity was inconsistent in the following ways:

- is_strides_like 2d/3d and is_non_overlapping_and_dense always were computed
  based on sizes_and_strides_, even if you had symbolic ints
- Furthermore, even if you set custom policy for strides, these quantities were
  not overridable by subclasses
- Furthermore, we didn't even store these fields on ExtraMeta
- We duplicate implementations of compute_contiguous (plain, channels last,
  channels last 3d)
- We inconsistently called refresh_numel()/refresh_contiguous(), versus
  recomputing it ourselves

This factor makes a consistent strategy for all of the boolean fields, and
for numel computation.  After this refactor:

- All layout boolean fields are interposable via strides policy
  and can be overridden from Python; you will never access a garbage field
- All layout boolean fields are on ExtraMeta
- You can always call refresh_numel/contiguous, no matter if your Tensor is
  contiguous or not
- The numel/layout boolean fields are always populated consistently with
  the sizes strides fields (either on Tensor or ExtraMeta), even if you
  have custom policy
- There is only one implementation of the actual computation logic

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D39907696](https://our.internmc.facebook.com/intern/diff/D39907696)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85858
Approved by: https://github.com/albanD
2022-09-30 21:26:34 +00:00
samdow
18d8c548f4 [Modes] remove enable and rewrite mode stack (squashed) (#84774)
Based on @ezyang's suggestion, mode stack now has "one true mode" which is the _only_ mode that can ever be active at the C++ level. That mode's torch dispatch is just to take the top mode in the stack, reenable itself (if we aren't at the end of the mode stack), and run the top mode's torch_{dispatch|function}

This maintains that in the middle of a mode's torch dispatch, the mode itself will not be active. It changes the function the user has to call to see what the current mode is (no longer queries the C++, it's python only) but allows the user to also see the entire mode stack easily

Removes `enable_torch_dispatch_mode` and `.restore()` since neither makes sense in this new setup

### Background
Why do we want this? Well, a pretty common pattern that was coming up was that users had to do something like

```python
## PRE-PR UX
def f(mode):
  with mode.restore():  # user needs to understand this restore thing?
    ...

with Mode() as m:
  pass
f(m)
```

Many users were getting error from forgetting to call `.restore` or from forgetting to add the (tbh weird) "mode instantiation"  step where they use the mode as a context manager with an empty body. Really, they wanted to treat modes like context managers and just write
```python
## FROM FEEDBACK, USER DESIRED CODE. POSSIBLE POST-PR
def f(mode):
  with mode:
    ...
f(Mode())
```

** Technical Details **
With the old mode stack, we basically had a linked list so the mode itself could only be used once and had a fixed parent. In this new design, the mode stack is just a python list that we're pushing to and popping from. There's only one mode that's ever active at the C++ level and it runs the next mode in the Python list. The modes don't have state on them anymore
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84774
Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-09-27 01:04:35 +00:00
Horace He
90fa744c09 Fixed memory issues in linalg_lstsq (#85357)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85357
Approved by: https://github.com/ezyang, https://github.com/IvanYashchuk
2022-09-20 21:13:06 +00:00
Michael Voznesensky
8ca1839d32 Python Dispatcher integration with C++ dispatcher (#85050)
#84826 but without ghstack
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85050
Approved by: https://github.com/malfet
2022-09-15 00:43:36 +00:00
PyTorch MergeBot
706b990306 Revert "Python Dispatcher integration with C++ dispatcher (#84826)"
This reverts commit 35f6a69191.

Reverted https://github.com/pytorch/pytorch/pull/84826 on behalf of https://github.com/malfet due to Broke dynamo, see 35f6a69191
2022-09-14 14:07:58 +00:00
Michael Voznesensky
35f6a69191 Python Dispatcher integration with C++ dispatcher (#84826)
Signed-off-by: Edward Z. Yang <ezyangfb.com>

From @ezyang's original PR:

There are a number of situations where we have non-backend kernels (e.g., CompositeImplicitAutograd, batching rules) which we would like to port to Python, but we have no way to integrate these ports with the overall system while using preexisting C++ registrations otherwise. This PR changes that by introducing a Python dispatcher (which can have its own kernels directly in Python), which can be interpose over ordinary C++ dispatch. The ingredients:

We introduce a new PythonDispatcher dispatch key, that has the same tenor as FuncTorchDynamicLayerFrontMode: it works by getting triggered before every other dispatch key in the dispatch key, and shunting to a Python implementation
The Python dispatcher is a per-interpreter global object that is enabled/disabled via the guard EnablePythonDispatcher/DisablePythonDispatcher. We don't make it compositional as I have no idea what a compositional version of this feature would look like. Because it is global, we don't need to memory manage it and so I use a simpler SafePyHandle (newly added) to control access to this pointer from non-Python C++. Like __torch_dispatch__, we use PyInterpreter to get to the Python interpreter to handle the dispatch.
I need to reimplement dispatch table computation logic in Python. To do this, I expose a lot more helper functions for doing computations on alias dispatch keys and similar. I also improve the pybind11 handling for DispatchKey so that you can either accept the pybind11 bound enum or a string; this simplifies our binding code. See https://github.com/pybind/pybind11/issues/483#issuecomment-1237418106 for how this works; the technique is generally useful.

I need to be able to call backend fallbacks. I do this by permitting you to call at a dispatch key which doesn't have a kernel for the operator; if the kernel doesn't exist, we check the backend fallback table instead.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84826
Approved by: https://github.com/ezyang
2022-09-14 06:57:19 +00:00
Edward Z. Yang
0491e1a13a Support returning symbolic strides from t.stride() in Python (#83842)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83842
Approved by: https://github.com/albanD, https://github.com/Chillee, https://github.com/bdhirsh
2022-08-24 04:32:51 +00:00
Brian Hirsh
0c24af4985 Always allow tensor metadata changes (#83590)
Make it so that it is valid to set metadata after detach calls, like `x.detach().resize_(...)`.

This technically lifts some restrictions around `.data`. This PR means that you can now technically call `x.data.resize_(...)`, which can now directly resize `x` instead of erroring.

My understanding: Before the tensor-variable merge, when `x` and `x.data` were really different tensors, you could resize `x.data` independently of `x`, and during the merge, this error was added to avoid silent confusing behavior changes.

It was agreed that this error has been around long enough (several years) that it's acceptable to drop.  cc @albanD @ezyang.

(Ed already had a prototype PR [here](https://github.com/pytorch/pytorch/pull/83545) - I ended up making one to try to slog through test failures).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83590
Approved by: https://github.com/ezyang
2022-08-19 23:30:43 +00:00
Edward Z. Yang
a3907ca92d Respect TorchDispatchMode for shallow_copy_and_detach (#83372)
I noticed I was missing tensor creations with modes when I tried
to delete proxy tensor.  This was the cause.

Hypothetically, all PyInterpreter calls could get this treatment.
But I think it only matters for detach; the rest do not return
Tensors and most modes will not be interested in them.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83372
Approved by: https://github.com/zou3519
2022-08-16 14:32:27 +00:00
PyTorch MergeBot
f534b2c627 Revert "Remove split functional wrapper (#74727)"
This reverts commit a58876ace7.

Reverted https://github.com/pytorch/pytorch/pull/74727 on behalf of https://github.com/seemethere due to Fails internal use cases, might extend out to external use cases as well. Need to assess overall impact of this change more widely
2022-08-10 19:45:23 +00:00
Peter Bell
a58876ace7 Remove split functional wrapper (#74727)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74727
Approved by: https://github.com/albanD, https://github.com/khabinov
2022-08-10 17:57:48 +00:00
Peter Bell
2c2278a960 Make python TensorOption signatures consistent with JIT schemas (#82241)
Fixes #81774

`TensorOptions` arguments in the JIT schema are optional, but in the Python API these were being translated to non-optional but with a default value. This change makes the arguments accept `None` for consistency with the JIT schema. However, it also means that `dtype=c10::nullopt` was previously completely untested so this also fixes several related bugs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82241
Approved by: https://github.com/ngimel
2022-08-07 00:10:27 +00:00
Nikolay Korovaiko
d2c47d559c Revert "Revert "Enabling SymInt in autograd; take 3 (#81145)"" ; make sure is_intlist checks for symintnodes (#82189)
### Description
<!-- What did you change and why was it needed? -->

### Issue
<!-- Link to Issue ticket or RFP -->

### Testing
<!-- How did you test your change? -->

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82189
Approved by: https://github.com/ezyang
2022-07-26 20:47:11 +00:00
Edward Z. Yang
563f6c7a9e Pass stride overload, not overload packet; add aten.stride.default (#82083)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82083
Approved by: https://github.com/albanD
2022-07-25 18:28:30 +00:00
samdow
2ac24675cc get rid of push_torch_{dispatch, function}_mode (#78215)
Currently we have 2 ways of doing the same thing for torch dispatch and function modes:
`with push_torch_dispatch_mode(X)` or `with X.push(...)`
is now the equivalent of doing
`with X()`

This removes the first API (which is older and private so we don't need to go through a deprecation cycle)

There is some risk here that this might land race with a PR that uses the old API but in general it seems like most are using the `with X()` API or `enable_torch_dispatch_mode(X())` which isn't getting removed.

EDIT: left the `with X.push(...)` API since there were ~3 land races with that over the past day or so. But made it give a warning and ask users to use the other API
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78215
Approved by: https://github.com/ezyang
2022-07-22 18:56:37 +00:00
Edward Z. Yang
fca03eeec1 Make proxy tensor support item() calls on torch.tensor constants (#81192)
This PR is doing a few interrelated things, all of which are necessary to get correctness. Read the comment in torch/fx/experimental/proxy_tensor.py for the high level overview.

Let's break down the parts of this PR:

* Bug fix where `enable_torch_dispatch_mode` with `None` doesn't work. This make `enable_torch_dispatch_mode(current_mode.inner)` work which is the basis for how we temporarily disable fake tensor mode.
* Bug fix for when fake tensor mode is combined with a non-mode tensor subclass. This actually could be ablated from this PR but it affects where the logic for allowing non fake tensor inputs with lift goes, so it's all in here in one go. There are some relevant tests for the fix in fake tensor, but it turns out I didn't need this because I'm always using proxy tensors as a mode (which ensures the ordering is right.)
* New `lift_fresh` view operator.  Note that like lift, we have to manually write the functionalize kernel for these functions.
* The actual change, which is to save constants when we see them in the proxy tensor mode, and then propagate them as we go (because otherwise you'll handle mutations on constants incorrectly--see test.)

This is mildly BC-breaking if anyone was previously interposing on
at::lift, but this operator was relatively new and I checked
functorch which has no explicit reference to lift.  So I think it
should not be too disruptive.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81192
Approved by: https://github.com/samdow, https://github.com/bdhirsh
2022-07-15 03:53:40 +00:00
lezcano
b5b9db9f84 Make kl_div a composite function. (#80334)
Benchmarks: https://github.com/pytorch/pytorch/pull/80334#issuecomment-1167229285

Fixes https://github.com/pytorch/pytorch/issues/80158
Fixes https://github.com/pytorch/pytorch/issues/78867
Fixes https://github.com/pytorch/pytorch/issues/69230

Supersedes https://github.com/pytorch/pytorch/pull/79007
Supersedes https://github.com/pytorch/pytorch/pull/69212
Supersedes https://github.com/pytorch/pytorch/pull/19659
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80334
Approved by: https://github.com/ezyang
2022-07-13 20:07:36 +00:00
Edward Z. Yang
d4f065d261 Return mode object from __enter__ (#80998)
This makes `with Mode() as m:` work.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80998
Approved by: https://github.com/samdow
2022-07-12 23:22:26 +00:00
PyTorch MergeBot
7f3677d723 Revert "Remove split functional wrapper (#74727)"
This reverts commit cc3126083e.

Reverted https://github.com/pytorch/pytorch/pull/74727 on behalf of https://github.com/mehtanirav due to Breaking multiple internals builds and tests
2022-07-11 18:29:45 +00:00
Peter Bell
cc3126083e Remove split functional wrapper (#74727)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74727
Approved by: https://github.com/albanD
2022-07-08 19:21:22 +00:00
Nikolay Korovaiko
8389ccbcd8 reinstate size and shape returning symints (#79560)
This PR redirects `size` and `.shape` to call `sym_sizes`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79560
Approved by: https://github.com/Chillee
2022-07-08 01:17:33 +00:00
Edward Z. Yang
3ca309c4b8 Correctly setup ancestors on explicit push mode. (#80995)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80995
Approved by: https://github.com/Chillee
2022-07-07 02:01:12 +00:00
Edward Z. Yang
74877943b8 Don't invoke mode as overloaded argument in torch dispatch (#80992)
I noticed that in some situations torch dispatch modes were being
invoked with a mode active, which isn't supposed to happen (we
disable modes before calling into the user mode.)  I also noticed that
I was getting a warning that I had a deprecated non-static definition of
torch dispatch on an argument even though there wasn't any.

It turns out this is because modes were part of the overloaded arguments
list in the Python fallback kernel for torch dispatch.  This is wrong;
instead we should rely on the actual dispatching function to consult
modes.  This makes the code simpler.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80992
Approved by: https://github.com/zou3519
2022-07-06 23:45:59 +00:00
George Qi
393f7f6ad7 add layout to slow path (#80429)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80429
Approved by: https://github.com/ezyang
2022-07-06 18:01:31 +00:00
PyTorch MergeBot
f2c8557521 Revert "Make kl_div a composite function. (#80334)"
This reverts commit 828c787ea9.

Reverted https://github.com/pytorch/pytorch/pull/80334 on behalf of https://github.com/ezyang due to doesn't work with xla
2022-07-06 17:51:06 +00:00
anjali411
7f37b1b3e2 Disallow creating a library with prim namespace (#80913)
Fixes https://github.com/pytorch/pytorch/issues/77138
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80913
Approved by: https://github.com/ezyang
2022-07-06 14:31:24 +00:00
Nikolay Korovaiko
0a5123a752 Revert "Revert "Add support for directly passing symint to empty"" (#79954)
Relanding https://github.com/Krovatkin/pytorch/pull/new/krovatkin/symint_empty

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79954
Approved by: https://github.com/Chillee, https://github.com/kulinseth
2022-07-04 20:08:55 +00:00
lezcano
828c787ea9 Make kl_div a composite function. (#80334)
Benchmarks: https://github.com/pytorch/pytorch/pull/80334#issuecomment-1167229285

Fixes https://github.com/pytorch/pytorch/issues/80158
Fixes https://github.com/pytorch/pytorch/issues/78867
Fixes https://github.com/pytorch/pytorch/issues/69230

Supersedes https://github.com/pytorch/pytorch/pull/79007
Supersedes https://github.com/pytorch/pytorch/pull/69212
Supersedes https://github.com/pytorch/pytorch/pull/19659
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80334
Approved by: https://github.com/ezyang
2022-07-04 19:33:43 +00:00
samdow
24243659e4 disable modes during constructor
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79143

Approved by: https://github.com/ezyang
2022-06-17 22:28:27 +00:00
drisspg
b9f83cb737 use is_same_size in autograd init (#79553)
Broke: #79446 into a smaller commit that just adds is_same_size to the the autograd __init_file. This function is_same_size will be dispatched to the original behavior for regular tensors
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79553
Approved by: https://github.com/soulitzer
2022-06-15 19:49:42 +00:00
George Qi
05624bcf7b add sizes to slowpath
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79295

Approved by: https://github.com/ezyang
2022-06-14 01:19:59 +00:00
samdow
5e926aafab add utils for checking that all modes are in the same scope and finding the outermost mode
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78847

Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-06-10 19:31:05 +00:00
samdow
3734fcc8f8 add ability to push a mode if the current mode is an ancestor
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78822

Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-06-10 18:27:04 +00:00
George Qi
a90f006fe5 add strides to slow path
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78610

Approved by: https://github.com/ezyang
2022-06-10 16:59:14 +00:00
Edward Z. Yang
eb856daf0f Do not treat all dense tensors as isTensorSubclassLike
Fixes https://github.com/pytorch/pytorch/issues/79079

Signed-off-by: Edward Z. Yang <ezyangfb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79098

Approved by: https://github.com/soulitzer, https://github.com/albanD
2022-06-09 03:00:57 +00:00
Zachary DeVito
ab6c7b4b3f fix __torch_function__ bug in getindex that causes an error not set exception
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78781

Approved by: https://github.com/ezyang
2022-06-06 17:02:57 +00:00
samdow
184e0065b3 add better error message for class method
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78821

Approved by: https://github.com/ezyang
2022-06-06 13:31:32 +00:00
Michael Suo
22b10873f3 Allow torchdispatch to customize dim()
This follows the template in
https://github.com/pytorch/pytorch/pull/77396

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78691

Approved by: https://github.com/ezyang
2022-06-02 20:54:13 +00:00
anjali411
79ddc32b6a Add a check to ensure input func to Library.impl is callable
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77990

Approved by: https://github.com/albanD
2022-06-02 16:55:39 +00:00
Michael Suo
876c359347 Generalize sizes and strides policy on _make_wrapper_subclass
Previously, there was a `dispatch_strides` boolean arg. Change this to
a string argument that directly maps onto `SizesStridesPolicy`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78646

Approved by: https://github.com/ezyang
2022-06-02 02:06:38 +00:00
samdow
aa06d05297 enable with semantics
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78214

Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-06-01 21:14:45 +00:00
Elias Ellison
678213ead2 Fake Tensor Part 1
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77969

Approved by: https://github.com/ezyang
2022-05-31 16:20:35 +00:00
soulitzer
f3af51069d Modernize LoggingTensorMode
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77667

Approved by: https://github.com/malfet
2022-05-24 22:41:49 +00:00
Elias Ellison
2d93e1fada Add slow path for device
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77684

Approved by: https://github.com/ezyang
2022-05-24 21:56:01 +00:00
George Qi
294fff16ec add slow path for is_contiguous (#77906)
Test Plan: CI

Reviewed By: malfet, b0noI

Differential Revision: D36493890

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77906
Approved by: https://github.com/malfet
2022-05-19 22:52:45 +00:00
anjali411
5984bc8233 Allow specifying alias analysis while registering new ops
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77690

Approved by: https://github.com/ezyang
2022-05-19 21:11:40 +00:00
PyTorch MergeBot
00a187c373 Revert "add slow path for is_contiguous"
This reverts commit f6beda89c6.

Reverted https://github.com/pytorch/pytorch/pull/77396 on behalf of https://github.com/malfet
2022-05-19 17:07:54 +00:00
Edward Z. Yang
4941e72e40 Revert "Revert "Implement sym_sizes to create proper IR for sym ints representing tensor sizes (#76836)""
This reverts commit c35bd8d423.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77719

Approved by: https://github.com/Chillee, https://github.com/malfet
2022-05-18 18:40:57 +00:00
PyTorch MergeBot
48581d74ad Revert "Add dispatch mode testing for meta tensors and other stuff"
This reverts commit c1cdb1216b.

Reverted https://github.com/pytorch/pytorch/pull/77477 on behalf of https://github.com/malfet
2022-05-18 02:56:48 +00:00
George Qi
f6beda89c6 add slow path for is_contiguous
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77396

Approved by: https://github.com/ezyang, https://github.com/cpuhrsch
2022-05-18 02:25:27 +00:00
Edward Z. Yang
c1cdb1216b Add dispatch mode testing for meta tensors and other stuff
We don't have any coverage for meta tensor correctness for backwards
because torch function mode can only allow us to interpose on
Python torch API calls, but backwards invocations happen from C++.
To make this possible, I add torch_dispatch_meta test which runs the
tests with __torch_dispatch__

While doing this, I needed to generate fresh expected failure / skip
lists for the new test suite, and I discovered that my original
scaffolding for this purpose was woefully insufficient.  So I rewrote
how the test framework worked, and at the same time rewrote the
__torch_function__ code to also use the new logic.  Here's whats
new:

- Expected failure / skip is now done on a per function call basis,
  rather than the entire test.  This means that separate OpInfo
  samples for a function don't affect each other.

- There are now only two lists: expect failure list (where the test
  consistently fails on all runs) and skip list (where the test
  sometimes passes and fails.

- We explicitly notate the dtype that failed.  I considered detecting
  when something failed on all dtypes, but this was complicated and
  listing everything out seemed to be nice and simple.  To keep the
  dtypes short, I introduce a shorthand notation for dtypes.

- Conversion to meta tensors is factored into its own class
  MetaConverter

- To regenerate the expected failure / skip lists, just run with
  PYTORCH_COLLECT_EXPECT and filter on a specific test type
  (test_meta or test_dispatch_meta) for whichever you want to update.

Other misc fixes:

- Fix max_pool1d to work with BFloat16 in all circumstances, by making
  it dispatch and then fixing a minor compile error (constexpr doesn't
  work with BFloat16)

- Add resolve_name for turning random torch API functions into string
  names

- Add push classmethod to the Mode classes, so that you can more easily
  push a mode onto the mode stack

- Add some more skips for missing LAPACK

- Added an API to let you query if there's already a registration for
  a function, added a test to check that we register_meta for all
  decompositions (except detach, that decomp is wrong lol), and then
  update all the necessary sites to make the test pass.

Signed-off-by: Edward Z. Yang <ezyangfb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77477

Approved by: https://github.com/zou3519
2022-05-18 00:18:34 +00:00
Edward Z. Yang
b5bc954a71 Fix optional dtype/layout/memory_format pycall; fix memory format
Double-header bug fix:

- As reported by jansel, dtypes are still showing up as integers
  when the schema is an optional dtype.  This is simple enough to
  fix and I added a test for it.  But while I was at it...

- I noticed that the THPMemoryFormat_new idiom with "unused" name
  doesn't actually work, the repr of the returned memory format
  object is wrong and this shows up when we try to log the args/kwargs.
  So I fixed memory format to do it properly along with everything
  else.

Fixes https://github.com/pytorch/pytorch/issues/77135

Signed-off-by: Edward Z. Yang <ezyangfb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77543

Approved by: https://github.com/albanD, https://github.com/jansel
2022-05-16 16:46:08 +00:00
Sherlockk Huang
61dcde88a6 Jiterator with Python Registration (#77121)
You can now do a lot of crazy things about redefining the behavior of an operator, and still be fast in cuda !!!

Example 1: swapping where's branches
```
code_string = "template <typename T> T inverted_where(bool cond, T a, T b){ return !cond ? a : b; }"
jitted_fn = torch.cuda.jiterator._create_jit_fn(code_string)
my_lib = torch.library.Library("aten", "IMPL")
my_lib.impl('aten::where.self', jitted_fn, "CUDA")

# torch.where is now overridden
```
Example 2: approximate gelu with relu
```
code_string = "template <typename T> T fast_gelu(T a){ return a > 0 ? a : 0;}"
jitted_fn = torch.cuda.jiterator._create_jit_fn(code_string)
my_lib = torch.library.Library("aten", "IMPL")
my_lib.impl('aten::gelu', jitted_fn, "CUDA")

# torch.nn.GELU and torch.nn.function.gelu are now overridden
```
Example 3: clipping output for numerical unstable kernels
```
code_string = "template <typename T> T clipped_exp(T a){ return a > T(10.0) ? T(22026.4657948) : exp(a); }"
jitted_fn = torch.cuda.jiterator._create_jit_fn(code_string)
my_lib = torch.library.Library("aten", "IMPL")
my_lib.impl('aten::exp', jitted_fn, "CUDA")

# torch.exp(x) and x.exp() are now overridden
```
Example 4: Simulate buggy hardware behaviors
```
code_string = "template <typename T> T buggy_add(T a, T b){ return a + b + T(1); }"
jitted_fn = torch.cuda.jiterator._create_jit_fn(code_string)
my_lib = torch.library.Library("aten", "IMPL")
my_lib.impl('aten::add.Tensor', jitted_fn, "CUDA")

torch.add(x, y), "x + y" and x.add(y) are now overridden
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77121
Approved by: https://github.com/anjali411
2022-05-10 20:54:23 +00:00
anjali411
767af8e335 Add meta tensor support for some operations using python registration
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76916

Approved by: https://github.com/ezyang
2022-05-10 17:55:06 +00:00
Edward Z. Yang
f2eed9400d Register PrimTorch refs as decompositions.
For the most part, PrimTorch refs have the same signature as their
ATen equivalents.  I modify most PrimTorch refs to register themselves
as decompositions, using the prim name they wrap to find the aten name
(except for a few cases where the prim/aten names mismatch).  There are
some exclusions, falling into one of two categories:

- The torch equivalent was already implemented as a CompositeImplicitAutograd
  decomposition in C++

- The ref doesn't support enough features (e.g., the real deal has more
  kwargs / overloads than are currently implemented)

PrimTorch refs are written as a single function that supports all
overloads, and this style is convenient for cases where we have a bundle
of overloads for what morally is a single overload with a Union type
on an argument (which we ought to have supported in
native_functions.yaml but blah); to support registering a single decomp
for all the overloads, we modify register_decomposition to register
to ALL overloads if you pass it an overload packet.  This is technically
BC breaking but no tests started failing because of it.

Signed-off-by: Edward Z. Yang <ezyangfb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76835

Approved by: https://github.com/Chillee, https://github.com/mruberry
2022-05-06 20:11:45 +00:00
anjali411
07f766df54 Allow creating new libraries and defining new operators from Python
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76250

Approved by: https://github.com/ezyang
2022-05-05 03:33:08 +00:00
anjali411
55f55a4cf6 Allow users to override kernels for existing C++ ops through Python
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75905

Approved by: https://github.com/ezyang
2022-05-05 03:31:39 +00:00
samdow
6779366f27 add nested mode to python mode
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75965

Approved by: https://github.com/albanD, https://github.com/ezyang, https://github.com/zou3519
2022-05-04 13:01:06 +00:00
samdow
598e7e5f19 [Reland] Change 'python mode' to 'torch dispatch mode'
Changes Python Mode name to Torch Dispatch Mode because there is now a Torch Function Mode, so Torch Dispatch Mode and Torch Function Mode are consistent with each other
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76562
Approved by: https://github.com/zou3519, https://github.com/albanD
2022-05-02 20:06:43 +00:00
PyTorch MergeBot
395a620a4f Revert "Change 'python mode' to 'torch dispatch mode'"
This reverts commit 7203a73986.

Reverted https://github.com/pytorch/pytorch/pull/76562 on behalf of https://github.com/janeyx99
2022-05-02 14:42:11 +00:00
samdow
7203a73986 Change 'python mode' to 'torch dispatch mode'
Changes Python Mode name to Torch Dispatch Mode because there is now a Torch Function Mode, so Torch Dispatch Mode and Torch Function Mode are consistent with each other
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76562
Approved by: https://github.com/zou3519
2022-05-02 13:33:58 +00:00
albanD
cd0591dff3 Change default TLS behavior in dispatch to favor is-a style
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75827

Approved by: https://github.com/ezyang
2022-04-20 17:32:29 +00:00
Edward Z. Yang
2772870860 Preserve Python dispatch keys upon copy_tensor_metadata_except_version_counter
Whether or not this is a reasonable operation to do in the presence of
subclasses is a good question in and of itself, but this fixes an
obvious invariant violation, which is that if a Tensor reports that
it is a tensor subclass, it had better have the Python dispatch key.
Previously, the dispatch key would have gotten unconditionally cleared;
now we preserve what ever the original bit was.

Signed-off-by: Edward Z. Yang <ezyangfb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75644

Approved by: https://github.com/albanD
2022-04-15 13:26:23 +00:00