Commit Graph

63 Commits

Author SHA1 Message Date
PyTorch MergeBot
b2f09c1859 Revert "[compiled autograd] support custom ops backed by c++ autograd::Function (#120681)"
This reverts commit d27509c384.

Reverted https://github.com/pytorch/pytorch/pull/120681 on behalf of https://github.com/xmfan due to breaking internal builds, see D54707287 ([comment](https://github.com/pytorch/pytorch/pull/120681#issuecomment-1989542344))
2024-03-11 22:18:36 +00:00
Simon Fan
d27509c384 [compiled autograd] support custom ops backed by c++ autograd::Function (#120681)
- Adds support for custom ops backed by c++ custom autograd functions, e.g. fbgemm
- Include files more granularly to avoid namespace pollution and circular imports

limitations:
- requires user to audit their code and opt-in their custom autograd::Function via autograd::Function::is_traceable and maybe additional compiled_args + apply_with_saved implementation. this was the only way I can think of for soundness
- will throw if we can't hash the saved_data i.e. for any non implemented type other than list and dict in at::IValue::hash b0cfa96e82/aten/src/ATen/core/ivalue.cpp (L364)
- can technically silently fail if both the typeid hash and the typeid string name of the custom autograd::Function collide at the same time, and an identical autograd graph containing a different custom autograd::Function, yet that has an identical implementation, is called. this case seems extremely unlikely, and the only alternative to hash collision i can think of is compiling with reflection
- tensors not saved via save_variables are not lifted, and are specialized on TensorImpl*'s hash (treated as a memory address). if needed, we can lift them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120681
Approved by: https://github.com/jansel
2024-03-08 20:43:29 +00:00
PyTorch MergeBot
2b1661c7a0 Revert "[compiled autograd] support custom ops backed by c++ autograd::Function (#120681)"
This reverts commit 05c256849b.

Reverted https://github.com/pytorch/pytorch/pull/120681 on behalf of https://github.com/izaitsevfb due to breaking internal builds, see D54617701 ([comment](https://github.com/pytorch/pytorch/pull/120681#issuecomment-1984214079))
2024-03-07 18:53:51 +00:00
Simon Fan
05c256849b [compiled autograd] support custom ops backed by c++ autograd::Function (#120681)
- Adds support for custom ops backed by c++ custom autograd functions, e.g. fbgemm
- Include files more granularly to avoid namespace pollution and circular imports

limitations:
- requires user to audit their code and opt-in their custom autograd::Function via autograd::Function::is_traceable and maybe additional compiled_args + apply_with_saved implementation. this was the only way I can think of for soundness
- will throw if we can't hash the saved_data i.e. for any non implemented type other than list and dict in at::IValue::hash b0cfa96e82/aten/src/ATen/core/ivalue.cpp (L364)
- can technically silently fail if both the typeid hash and the typeid string name of the custom autograd::Function collide at the same time, and an identical autograd graph containing a different custom autograd::Function, yet that has an identical implementation, is called. this case seems extremely unlikely, and the only alternative to hash collision i can think of is compiling with reflection
- tensors not saved via save_variables are not lifted, and are specialized on TensorImpl*'s hash (treated as a memory address). if needed, we can lift them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120681
Approved by: https://github.com/jansel
2024-03-06 18:01:56 +00:00
Catherine Lee
b3a9d677a3 [ez] Add super() calls in test_custom_ops (#121239)
Some disable issues are getting spammed
Check that test_impl_invalid_devices gets skipped by the disable issue
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121239
Approved by: https://github.com/zou3519
2024-03-05 21:16:06 +00:00
Simon Fan
d08ce51881 [compiled autograd] refactor eager test loading and run custom ops tests (#120679)
TestCustomOp's tests uses helper attributes and functions from a util parent class. To support arbitrary test classes, we need to refactor the current approach. Instead of allowlisting certain methods, we can instead copy the whole class and only overwrite the "test_.*" methods.

Compiled autograd fails on ~10/90 of the newly added tests. test_autograd_function_backed_op is the example we discussed in PT-2D meeting about requiring c++ autograd::Function support. I'm addressing this in #120732

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120679
Approved by: https://github.com/jansel, https://github.com/zou3519
2024-03-01 22:48:17 +00:00
atalman
244b124bb8 Add linux cpu test for 3.12 (#117853)
This is continuation of work: https://github.com/pytorch/pytorch/pull/113987

Co-authored-by: albanD <desmaison.alban@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117853
Approved by: https://github.com/albanD
2024-02-14 20:52:23 +00:00
Sergii Dymchenko
bd9db6a9c7 Update to TorchFix 0.4.0 (#119424)
`torch.library.Library` updated to `torch.library._scoped_library` in files with many tests where it seems obvious to do, otherwise `noqa: TOR901` added - see https://github.com/pytorch/pytorch/pull/118318 for more context.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119424
Approved by: https://github.com/zou3519
2024-02-12 23:30:12 +00:00
Edward Z. Yang
0249c4a785 Add config toggle suggestions for data-dependent/dynamic output shape (#114337)
Fixes https://github.com/pytorch/pytorch/issues/114220

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114337
Approved by: https://github.com/aakhundov
2024-01-05 14:01:01 +00:00
youkaichao
16373bbc1f fix error message in pytorch (#115349)
Fixes https://dev-discuss.pytorch.org/t/typo-in-error-message/1709 .

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115349
Approved by: https://github.com/Skylion007
2023-12-07 19:27:29 +00:00
rzou
b694f88ef6 Grandfather in built-in TorchScript ops to being pt2_compliant (#113061)
I'm seeing ops like torch.ops.aten.mul.complex being used with
torch.compile (though this seems strange to me), but we should
grandfather these in.

Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113061
Approved by: https://github.com/ezyang
ghstack dependencies: #113050
2023-11-09 02:35:33 +00:00
PyTorch MergeBot
d98182e34e Revert "Grandfather in built-in TorchScript ops to being pt2_compliant (#113061)"
This reverts commit 493b52b3d9.

Reverted https://github.com/pytorch/pytorch/pull/113061 on behalf of https://github.com/PaliC due to breaking internal tests - contacted author with errors ([comment](https://github.com/pytorch/pytorch/pull/113061#issuecomment-1802528592))
2023-11-08 19:36:41 +00:00
Richard Zou
d1c092ae1b Update impl_abstract_pystub to be less boilerplatey (#113182)
Summary:

We've made the following changes:
- The new way to use the API is `m.impl_abstract_pystub(module, context)`.
  Every subsequent m.def of an op inside the TORCH_LIBRARY block gives
  the op the `impl_abstract_pystub`.
- Added a mechanism to determine if an operator was defined in Python or C++.
  Library.define in Python appends the op to a global set, which is analogous
  to what we do for tracking Library.impl.
- If someone does `torch.library.impl_abstract` in Python for an operator, then
  we require that it has an `impl_abstract_pystub` specified and we also check
  that the module in the `impl_abstract_pystub` is the same as the module where
  the call to `torch.library.impl_abstract` exists.
- Unfortunately we can't check the "context" (which is the buck target on
  buck-based systems) because buck sits above us.

bypass-github-export-checks

Test Plan: - existing tests

Differential Revision: D51080493

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113182
Approved by: https://github.com/ezyang
2023-11-08 00:39:00 +00:00
PyTorch MergeBot
bc3e2e03cd Revert "Update impl_abstract_pystub to be less boilerplatey (#112851)"
This reverts commit 6ae4e3a8d2.

Reverted https://github.com/pytorch/pytorch/pull/112851 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/112851#issuecomment-1799539354))
2023-11-07 18:53:13 +00:00
Richard Zou
6ae4e3a8d2 Update impl_abstract_pystub to be less boilerplatey (#112851)
Summary:
We've made the following changes:
- The new way to use the API is `m.impl_abstract_pystub(module, context)`.
  Every subsequent m.def of an op inside the TORCH_LIBRARY block gives
  the op the `impl_abstract_pystub`.
- Added a mechanism to determine if an operator was defined in Python or C++.
  Library.define in Python appends the op to a global set, which is analogous
  to what we do for tracking Library.impl.
- If someone does `torch.library.impl_abstract` in Python for an operator, then
  we require that it has an `impl_abstract_pystub` specified and we also check
  that the module in the `impl_abstract_pystub` is the same as the module where
  the call to `torch.library.impl_abstract` exists.
- Unfortunately we can't check the "context" (which is the buck target on
  buck-based systems) because buck sits above us.

Test Plan: - existing tests

Differential Revision: D50972148

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112851
Approved by: https://github.com/ezyang
2023-11-07 16:07:42 +00:00
rzou
493b52b3d9 Grandfather in built-in TorchScript ops to being pt2_compliant (#113061)
I'm seeing ops like torch.ops.aten.mul.complex being used with
torch.compile (though this seems strange to me), but we should
grandfather these in.

Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113061
Approved by: https://github.com/ezyang
ghstack dependencies: #113049, #113050
2023-11-07 12:55:16 +00:00
PyTorch MergeBot
d94d72b397 Revert "Grandfather in built-in TorchScript ops to being pt2_compliant (#113061)"
This reverts commit 1d4d5e4319.

Reverted https://github.com/pytorch/pytorch/pull/113061 on behalf of https://github.com/clee2000 due to something in the stack broke distributed and inductor, pretty sure its the c10 one.  Not sure why so many things were flaky on this PR ([comment](https://github.com/pytorch/pytorch/pull/113061#issuecomment-1797251293))
2023-11-07 02:28:14 +00:00
rzou
1d4d5e4319 Grandfather in built-in TorchScript ops to being pt2_compliant (#113061)
I'm seeing ops like torch.ops.aten.mul.complex being used with
torch.compile (though this seems strange to me), but we should
grandfather these in.

Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113061
Approved by: https://github.com/ezyang
ghstack dependencies: #113036, #113049, #113050
2023-11-06 23:43:31 +00:00
rzou
71dca16610 Grandfather autogen'ed ops as pt2_compliant (#113036)
Summary:
I missed this when I grandfathered torchgen'ed aten ops as pt2_compliant.

Test Plan:
New test.

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113036
Approved by: https://github.com/williamwen42
2023-11-06 23:43:17 +00:00
PaliC
542fa4a2e7 Revert "Revert "Use OpOverload instead of OpOverloadPacket for size/s… (#113058)
Revert "Revert "Use OpOverload instead of OpOverloadPacket for size/stride/etc slots (#112119)""

This reverts commit a1d1b73a7c.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113058
Approved by: https://github.com/izaitsevfb
2023-11-06 19:38:49 +00:00
PyTorch MergeBot
a1d1b73a7c Revert "Use OpOverload instead of OpOverloadPacket for size/stride/etc slots (#112119)"
This reverts commit 2337d8d062.

Reverted https://github.com/pytorch/pytorch/pull/112119 on behalf of https://github.com/PaliC due to still breaking trt tests :( refer to diff ([comment](https://github.com/pytorch/pytorch/pull/112119#issuecomment-1795496395))
2023-11-06 17:01:50 +00:00
Richard Zou
185515368b Add generated opcheck test for if the pt2_compliant_tag is incorrectly applied (#112759)
Summary:
If there are xfails in the failures_dict and the operator has the
pt2_compliant_tag, then we raise an error. These generated tests are separate
from those in the failures dict because we don't actually need any sample
inputs to check this.

Test Plan: - New tests

Differential Revision: D50936201

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112759
Approved by: https://github.com/ezyang
2023-11-06 13:45:35 +00:00
Edward Z. Yang
2337d8d062 Use OpOverload instead of OpOverloadPacket for size/stride/etc slots (#112119)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112119
Approved by: https://github.com/yanboliang
2023-11-03 13:54:41 +00:00
PyTorch MergeBot
25e17f3522 Revert "Use OpOverload instead of OpOverloadPacket for size/stride/etc slots (#112119)"
This reverts commit dd24e92949.

Reverted https://github.com/pytorch/pytorch/pull/112119 on behalf of https://github.com/ZainRizvi due to Breaking internal tests. See D50912326 ([comment](https://github.com/pytorch/pytorch/pull/112119#issuecomment-1791072363))
2023-11-02 16:32:25 +00:00
Edward Z. Yang
dd24e92949 Use OpOverload instead of OpOverloadPacket for size/stride/etc slots (#112119)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112119
Approved by: https://github.com/yanboliang
2023-11-01 18:26:01 +00:00
rzou
ae72607e5f Add way to determine which overload an OpOverloadPacket will resolve to (#112199)
The types are a bit weird (we accept and return a string) because there
is not really a notion of OpOverloadPacket vs OpOverload in C++.

Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112199
Approved by: https://github.com/ezyang
ghstack dependencies: #112198
2023-10-29 15:36:14 +00:00
Richard Zou
bd0ea72b28 torch.library: Create helper function is_functional_schema (#111660)
I will need this again soon.

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111660
Approved by: https://github.com/soulitzer
2023-10-27 15:20:25 +00:00
rzou
d91a18c433 Grandfather in torchgen'ed aten ops to torch.Tag.pt2_compliant_tag (#112053)
In torchgen, we add the pt2_compliant_tag to all aten ops.

Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112053
Approved by: https://github.com/soulitzer
2023-10-26 21:21:09 +00:00
rzou
3219b728b6 [torch.library] Clarify torch.library.define's schema (#111915)
Unlike the previous torch.library.define, this schema doesn't take a
name (the name is a part of the qualname). We separated out the qualname
from the schema in the new APIs so that they're all consistent with each
other (they all accept the qualname separately).

Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111915
Approved by: https://github.com/suo, https://github.com/ezyang
ghstack dependencies: #111912
2023-10-25 21:20:54 +00:00
rzou
2d04be9a00 [torch.library] Add mechanism to add tags during define (#111912)
We extend torch.library.Library.define and torch.library.define
with a tags argument.

Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111912
Approved by: https://github.com/ezyang
2023-10-25 21:20:48 +00:00
Richard Zou
66b74d231a Change torch.library.impl to accept a device string (#111659)
torch.library.impl now accepts a device string (e.g. "cpu", "cuda"). It
still accepts DispatchKey strings, but we no longer document this, because
using arbitrary DispatchKeys is more for the power users.

We map the device string to a DispatchKey and then register the impl for
said DispatchKey. A user may also specify multiple device strings at once
or specify "types=default" to get a CompositeExplicitAutograd registration.

Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111659
Approved by: https://github.com/soulitzer
ghstack dependencies: #111380
2023-10-23 23:02:41 +00:00
Richard Zou
afb4914c3d Align torch.library.impl with the new torch.library style (#111308)
We add a new overload to torch.library.impl that accepts an optional
Library arg. If provided, the lifetime of the registration will be
tied to the Library arg, otherwise, it will live forever.

Test Plan:
- existing and new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111308
Approved by: https://github.com/soulitzer
ghstack dependencies: #111307
2023-10-16 22:32:23 +00:00
Richard Zou
9d9cc67592 Make torch.library.define consistent with the new APIs (#111307)
This PR introduces a new overload of torch.library.define. Like
impl_abstract, and our plans for the rest of the torch.library APIs, we
allow it to accept an optional library object to tie the lifetime of the
op definition to.

Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111307
Approved by: https://github.com/soulitzer, https://github.com/ezyang
2023-10-16 22:32:23 +00:00
rzou
2cf9782912 [generate_opcheck_tests] Add some reasonable defaults (#110977)
Summary:
Make it easier to add `generate_opcheck_tests` by adding defaults for
the failures_dict location, the additional decorators, and the test
utils.

Test Plan:
Existing tests

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110977
Approved by: https://github.com/williamwen42
ghstack dependencies: #110951
2023-10-11 14:28:05 +00:00
rzou
3a29cdc5e6 [optests] Add dontGenerateOpCheckTests and is_inside_opcheck_mode (#110951)
This PR adds the following helper functions for generated opcheck tests:
- dontGenerateOpCheckTests is a decorator that skips generation of the
  opcheck tests for the generated function
- is_inside_opcheck_mode lets us query if we are in a generated test.
  Useful for fast debugging out-of-tree without needing to update
  PyTorch.

Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110951
Approved by: https://github.com/williamwen42
2023-10-10 21:43:43 +00:00
rzou
1d0a8eed5d [generate_opcheck_tests] Enable using same failures_dict for multiple testclasses (#110164)
This PR allows us to use the same failures_dict for multiple test
classes. This is helpful if you have a bunch of small TestCase(es) and
to centralize all the failures dict into one big one.

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110164
Approved by: https://github.com/williamwen42
2023-09-28 17:56:45 +00:00
Richard Zou
bb9779ecd2 Revert D49640259: Revert D49615962: [optests] Test names in failure dicts should be prefixed with test class (#110094)
Summary: Revert D49640259: Revert D49615962: [optests] Test names in failure dicts should

Test Plan: revert-hammer

Differential Revision: D49645397

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110094
Approved by: https://github.com/izaitsevfb
2023-09-26 21:16:36 +00:00
PyTorch MergeBot
2393864070 Revert "[optests] Test names in failure dicts should be prefixed with test class (#110045)"
This reverts commit 76fcec74c4.

Reverted https://github.com/pytorch/pytorch/pull/110045 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/110045#issuecomment-1735711094))
2023-09-26 14:56:08 +00:00
rzou
ea20db8aa0 [optests] Excise unused operator_compile_check (#110011)
The recommendation is to just use `opcheck`, which has superceded all
uses of `operator_compile_check`.

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110011
Approved by: https://github.com/ezyang
ghstack dependencies: #109912
2023-09-26 13:24:21 +00:00
rzou
76fcec74c4 [optests] Test names in failure dicts should be prefixed with test class (#110045)
We want to use the same failures dict for multiple TestCase. This happens
common in e.g. fbgemm. To move towards that, we need to prefix each test name
with their test class to avoid ambiguity

Differential Revision: [D49615962](https://our.internmc.facebook.com/intern/diff/D49615962/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110045
Approved by: https://github.com/williamwen42
2023-09-26 03:21:12 +00:00
rzou
f8fcc54f70 Add torch.library.impl_abstract (#109912)
Changelog:
- torch.library.impl_abstract optionally accepts a torch.library.Library
  object. If passed in, then the lifetime of the registration is tied to
  the Library object.
- we've also changed torch.library.impl_abstract to work on all
  operators, including overloads.
- we refactored the `torch._custom_ops.*` and `torch._custom_op.*`
  impl_abstract APIs and put them under torch._library. This is the
  final resting place for them. I will follow-up with deleting
  all the `torch._custom_ops.*` stuff later.
- There is a new "SimpleOperatorRegistry" where we actually collect the
  abstract_impl. We will expand this to also hold the other
  torch._custom_ops.* APIs when we move those to torch.library

NB: Previously we had designed
`impl_abstract` assuming a very high-level Python-only custom op API.
We've revisited that since; now, impl_abstract works for all custom ops,
no matter python or C++, no matter the schema. The new refactored design
reflects this better.

Test Plan:
- existing and new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109912
Approved by: https://github.com/ezyang
2023-09-26 01:59:50 +00:00
rzou
8124a6c40c [TORCH_LIBRARY] Add impl_abstract_pystub (#109529)
We want users to be able to define custom ops in C++ but put the
abstract impl in Python (since it is easier to write them in Python and
the abstract impl better models device semantics and data-dependent
operators).

`m.impl_abstract_pystub(opname, python_module, context)` declares the
abstract_impl of the operator to exist in the given python module.
When the abstract_impl needs to be accessed (either via FakeTensor or
Meta), and it does not exist, the PyTorch Dispatcher will yell
with a descriptive error message.

Some details:
- We construct a new global AbstractImplPyStub mapping in
  Dispatcher.cpp. Read/write to this map is protected by the Dispatcher
  lock.
- We add a new Meta Tensor fallback kernel. The fallback errors out if there is
  no meta kernel, but also offers a nicer error message if we see that there is
  a pystub.
- We create a `torch._utils_internal.throw_abstract_impl_not_imported_error`
  helper function to throw errors. This way, we can throw different error
  messages in OSS PyTorch vs internal PyTorch. To invoke this from C++, we
  added a PyInterpreter::throw_abstract_impl_not_imported_error.

Differential Revision: [D49464753](https://our.internmc.facebook.com/intern/diff/D49464753/)

Differential Revision: [D49464753](https://our.internmc.facebook.com/intern/diff/D49464753)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109529
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2023-09-22 04:55:36 +00:00
rzou
122264a0c0 [generate_opcheck_tests] tests should ignore meta/FakeTensors (#109641)
These tests generally don't work on meta tensors because they need to
compare the data of the Tensors. For example, SchemaCheckMode errors out
if any inputs are meta or Fake because it needs to check their storages
to see if any mutation occurred and those do not have storages.

Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109641
Approved by: https://github.com/bdhirsh, https://github.com/soulitzer
ghstack dependencies: #109637, #109638, #109639, #109640
2023-09-20 06:33:37 +00:00
rzou
d3d71367b9 [generate_opcheck_tests] Always print a repro (#109640)
On failure of a test, we will always print a "repro". This repro isn't
really runnable but gives the user a sense of how to actually reproduce
the test without the test suite, because using the test suite is a bit
convoluted.

If the user passes PYTORCH_OPCHECK_PRINT_BETTER_REPRO, we will print a
fuller repro that saves the exact problematic test inputs to disk and
reads them back out.

Test Plan:
- expecttests on the generate_repro helper function
- tried this out locally.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109640
Approved by: https://github.com/bdhirsh, https://github.com/soulitzer
ghstack dependencies: #109637, #109638, #109639
2023-09-20 06:33:37 +00:00
rzou
10d575911e [generate_opcheck_tests] rename "success" to "xsuccess" (#109637)
Not BC breaking because no existing failures dict have "success" in
them.

Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109637
Approved by: https://github.com/bdhirsh, https://github.com/soulitzer
2023-09-20 06:33:37 +00:00
ydwu4
94a54b89aa [dynamo] Add BACKEND_MATCH guard to detect and recompile when backend changes (#107337)
**Motivation:**
We try to make torch.cond use torch.compile automatically so that we could error out when there is side-effects in the branches and correctly handle the closures.

Before this PR, we have a warning if we don't turn on a config raise_on_backend_change (turning it on gives us an error) for the following code:
```python
def foo()

# Inside torch.cond, we'd like to do something like
torch.compile(foo, backend="eager", fullgraph=True)(...)
...
# Users may then call torch.compile somewhere else.
# Dynamo will use the cached code of foo for "eager" backend
# but we expect dynamo to recompile with "inductor" backend.
torch.compile(foo, backend="inductor")(...)
```

This PR adds a BACKEND_MATCH guard. Effectively, it implements a per-backend cache. In the above example, the cached code for "eager" won't work for "inductor" due to guard check failures and the second torch.compile will do a re-compilation. In the future, it might be useful to have something like a configuration guard that guards against dynamo configuration changes across different compiles (e.g. compile a function with fullgraph=False then compile it again with fullgraph=True).

**Implementation:**
1. We add a guarded_backend_cache and check the most_recent_backend against the backend associated with cached code. We also remove the raise_on_backend_change flag.

Note: More lines are printed for debug log due to newly added context manager and guard adds .

**Test Plan:**
Removed original tests that raise on different backend and add a new test to test whether the BACKEND_MATCH guard can guard against backend change.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107337
Approved by: https://github.com/jansel
2023-09-14 15:49:30 +00:00
Edward Z. Yang
55f956f1d2 optests improvements based on torchvision usage on nms (#108929)
- Update cross-ref FakeMode test to use ShapeEnv.  Dynamic ops can now
  return an unbacked SymInt.  We always accept this as equal to whatever
  the real value was.
- Relax test so it works on all classes, not just unittest.TestCase
- Properly wrap the original method, so things like
  pytree.mark.parametrize are carried over
- Support dynamic shapes by default for make_fx `tracing_mode="fake"` without symbolifying everything else

Fixes https://github.com/pytorch/pytorch/issues/108927

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108929
Approved by: https://github.com/zou3519
2023-09-13 13:26:15 +00:00
Richard Zou
bfa8429c6a [optests] Changed failures_dict format to json; automatic update of failures_dict (#109110)
We changed the failures_dict format from .py to json and added a way to
automatically update the failures dict (the user can set
PYTORCH_OPCHECK_ACCEPT=1 to do so), assuming the tests don't crash in the
process.

Some details:
- We introduced a FailuresDict class that handles save/load and from which one
can query a test status ("xfail", "skip", etc).
- PYTORCH_OPCHECK_ACCEPT=1 does not override everything. In particular: it
doesn't try to update the failures dict for a test marked as "skip", but it
will update it for tests marked as "xfail" or "success".
- PYTORCH_OPCHECK_ACCEPT=1 also does not override the "comment" field, unless
it is flipping an "xfail" into "success".
- I'll update the gdoc linked in the comments with how to actually use
PYTORCH_OPCHECK_ACCEPT=1 internally (it's not trivial).

Note that this isn't multithreading-safe, the current recommendation is to run
the tests sequentially if the user wants to use PYTORCH_OPCHECK_ACCEPT=1.

Differential Revision: D49167181

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109110
Approved by: https://github.com/ezyang
2023-09-13 13:24:15 +00:00
PyTorch MergeBot
38fcf77a1b Revert "[dynamo] Add BACKEND_MATCH guard to detect and recompile when backend changes (#107337)"
This reverts commit 1a64ec7dd4.

Reverted https://github.com/pytorch/pytorch/pull/107337 on behalf of https://github.com/huydhn due to Sorry for reverting your change but inductor perf smoke test starts to regress after this ([comment](https://github.com/pytorch/pytorch/pull/107337#issuecomment-1710974588))
2023-09-08 02:03:48 +00:00
ydwu4
1a64ec7dd4 [dynamo] Add BACKEND_MATCH guard to detect and recompile when backend changes (#107337)
**Motivation:**
We try to make torch.cond use torch.compile automatically so that we could error out when there is side-effects in the branches and correctly handle the closures.

Before this PR, we have a warning if we don't turn on a config raise_on_backend_change (turning it on gives us an error) for the following code:
```python
def foo()

# Inside torch.cond, we'd like to do something like
torch.compile(foo, backend="eager", fullgraph=True)(...)
...
# Users may then call torch.compile somewhere else.
# Dynamo will use the cached code of foo for "eager" backend
# but we expect dynamo to recompile with "inductor" backend.
torch.compile(foo, backend="inductor")(...)
```

This PR adds a BACKEND_MATCH guard. Effectively, it implements a per-backend cache. In the above example, the cached code for "eager" won't work for "inductor" due to guard check failures and the second torch.compile will do a re-compilation. In the future, it might be useful to have something like a configuration guard that guards against dynamo configuration changes across different compiles (e.g. compile a function with fullgraph=False then compile it again with fullgraph=True).

**Implementation:**
1. We add a guarded_backend_cache and check the most_recent_backend against the backend associated with cached code. We also remove the raise_on_backend_change flag.

2. Then newly added context manager and guard adds more lines for debug log so we change the uppper limit from 50 to 55.

**Test Plan:**
Removed original tests that raise on different backend and add a new test to test whether the BACKEND_MATCH guard can guard against backend change.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107337
Approved by: https://github.com/jansel
2023-09-07 22:45:54 +00:00