Commit Graph

1104 Commits

Author SHA1 Message Date
Richard Zou
81cc9bba5e [autograd.Function] Kill the extension feature flag (#92026)
This PR removes the autograd.Function extension feature flag. This was
previously used for development of the functorch <> autograd.Function
interaction.

It's been in master for long enough with the feature flag defaulting to
True, so it's time to remove it.

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92026
Approved by: https://github.com/soulitzer
2023-01-17 13:36:42 +00:00
Richard Zou
2f9166ef89 [autograd.Function] Cleanup asymmetry in generate_vmap_rule and vmap (#91787)
This PR:
- changes generate_vmap_rule to either be True or False. Previously it
  could be True, False, or not set. This simplifies the implementation a
  bit.
- changes the vmap staticmethod to always be on the autograd.Function
  rather than sometimes defined.
  This is how the other staticmethods (forward, backward, jvp) are
  implemented, and it allows us to document it.

There are 4 possible states for the autograd.Function w.r.t. the
above (a short usage sketch follows the list):
- generate_vmap_rule is True, vmap staticmethod overridden. This raises
  an error when used with vmap.
- generate_vmap_rule is False, vmap staticmethod overridden. This is
  valid.
- generate_vmap_rule is True, vmap staticmethod not overridden. This is
  valid.
- generate_vmap_rule is False, vmap staticmethod not overridden. This
  raises an error when used with vmap.
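
A minimal sketch of the "generate_vmap_rule is True, vmap staticmethod not overridden" state (hypothetical MyExp; assumes a build where `torch.func.vmap` and the setup_context-style autograd.Function API are available):

```python
import torch

class MyExp(torch.autograd.Function):
    generate_vmap_rule = True  # ask autograd to derive the vmap rule

    @staticmethod
    def forward(x):
        return x.exp()

    @staticmethod
    def setup_context(ctx, inputs, output):
        ctx.save_for_backward(output)

    @staticmethod
    def backward(ctx, grad):
        (y,) = ctx.saved_tensors
        return grad * y

x = torch.randn(3, 4)
out = torch.func.vmap(MyExp.apply)(x)  # valid: the vmap rule is auto-generated
```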

Future:
- setup_context needs the same treatment, but that's a bit trickier to
  implement.

Test Plan:
- new unittest
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91787
Approved by: https://github.com/soulitzer
2023-01-17 13:36:34 +00:00
Edward Z. Yang
333540a458 Reland "Add torch.utils.device_mode" (#91796)
Original PR https://github.com/pytorch/pytorch/pull/91525

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91796
Approved by: https://github.com/albanD
2023-01-09 20:57:12 +00:00
PyTorch MergeBot
9b415240d4 Revert "Reland "Add torch.utils.device_mode" (#91796)"
This reverts commit 81b5eff3c3.

Reverted https://github.com/pytorch/pytorch/pull/91796 on behalf of https://github.com/huydhn due to This breaks trunk with the following failed test https://hud.pytorch.org/failure/test_jit_save%2CTestTracer
2023-01-09 04:45:47 +00:00
Edward Z. Yang
81b5eff3c3 Reland "Add torch.utils.device_mode" (#91796)
Original PR https://github.com/pytorch/pytorch/pull/91525

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91796
Approved by: https://github.com/albanD
2023-01-08 03:44:56 +00:00
Richard Zou
f012d0ea5b [autograd.Function] enable the extended Function feature flag by default (#91441)
The autograd.Function <> functorch interaction is in a mostly completed
state now. There are some minor action items remaining
(https://github.com/pytorch/pytorch/issues/90224), but I want to enable
the feature by default so that PyTorch CI / other parties / etc can
begin testing to see if there is any impact on the original
autograd.Function API (there shouldn't be).

The longer-term plan for the feature flag is:
- keep it around until at least the next release (so that people can
turn off the feature if it breaks something in existing code)
- delete the flag then (either before or after the release, I haven't
decided yet)

Test Plan:
- new test
- wait for CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91441
Approved by: https://github.com/albanD, https://github.com/soulitzer
2022-12-28 21:00:27 +00:00
soulitzer
1b2ee4d0e1 Update functorch supported autograd.Function to allow mark_dirty (#91222)
Fixes https://github.com/pytorch/pytorch/issues/90225
Uses what was originally in 32a57bcdb6

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91222
Approved by: https://github.com/zou3519
2022-12-28 03:53:47 +00:00
Huy Do
e40e4d36c9 Fix test_profiler_seq_nr flakiness (on macos) (#91019)
Fixes https://github.com/pytorch/pytorch/issues/66893

On MacOS, two `aten::sum` calls are reported sometimes where there should be only one.  This can be easily reproduced by running `pytest test_autograd.py -k test_profiler_seq_nr --verbose  --flake-finder` to see the flakiness.  The profile result when the test fails is as follows (sorted by CPU):

```
-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
                                                   Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls
-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
                                            aten::randn        16.67%       3.000us        27.78%       5.000us       2.500us             2
                                              aten::sum        16.67%       3.000us        27.78%       5.000us       2.500us             2
                                          aten::normal_        11.11%       2.000us        11.11%       2.000us       1.000us             2
                                              aten::add        11.11%       2.000us        11.11%       2.000us       2.000us             1
autograd::engine::evaluate_function: torch::autograd...        11.11%       2.000us        27.78%       5.000us       2.500us             2
                        torch::autograd::AccumulateGrad        11.11%       2.000us        16.67%       3.000us       1.500us             2
                                        aten::ones_like         5.56%       1.000us         5.56%       1.000us       1.000us             1
      autograd::engine::evaluate_function: SumBackward0         5.56%       1.000us        11.11%       2.000us       2.000us             1
                                           aten::expand         5.56%       1.000us         5.56%       1.000us       1.000us             1
                                            aten::copy_         5.56%       1.000us         5.56%       1.000us       0.500us             2
                                            aten::empty         0.00%       0.000us         0.00%       0.000us       0.000us             2
                                       aten::as_strided         0.00%       0.000us         0.00%       0.000us       0.000us             2
                                            aten::fill_         0.00%       0.000us         0.00%       0.000us       0.000us             2
                                       aten::empty_like         0.00%       0.000us         0.00%       0.000us       0.000us             1
                                    aten::empty_strided         0.00%       0.000us         0.00%       0.000us       0.000us             3
                                           SumBackward0         0.00%       0.000us         5.56%       1.000us       1.000us             1
      autograd::engine::evaluate_function: AddBackward0         0.00%       0.000us         0.00%       0.000us       0.000us             1
                                           AddBackward0         0.00%       0.000us         0.00%       0.000us       0.000us             1
                                aten::new_empty_strided         0.00%       0.000us         0.00%       0.000us       0.000us             2
-------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------
Self CPU time total: 18.000us
```

When it happens, the two `aten::sum` calls have different inputs:

```
                                              aten::sum         4.35%       1.000us        13.04%       3.000us       3.000us             1                          [[10, 10], []]
                                              aten::sum         8.70%       2.000us         8.70%       2.000us       2.000us             1                  [[10, 10], [], [], []]
```

I'm not sure what the internal difference is between `z.sum()` and `z.sum(dim=None)` here on macOS; I thought they were the same.
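
For reference, a table like the one above can be produced with a profile along these lines (a sketch, not the actual test):

```python
import torch

x = torch.randn(10, 10, requires_grad=True)
y = torch.randn(10, 10, requires_grad=True)
with torch.profiler.profile() as prof:
    z = x + y
    z.sum().backward()
# Sorted by self CPU time, mirroring the table shown above.
print(prof.key_averages().table(sort_by="self_cpu_time_total"))
```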

### Testing

`pytest test_autograd.py -k test_profiler_seq_nr --verbose  --flake-finder` to run the test 50 times, all pass.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91019
Approved by: https://github.com/malfet
2022-12-22 17:37:45 +00:00
soulitzer
d19988093d [autograd Function] Return input as-is if marked dirty even when requires_grad=False (#91214)
Fixes https://github.com/pytorch/pytorch/issues/90209

Somewhat related: https://github.com/pytorch/pytorch/issues/71119
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91214
Approved by: https://github.com/albanD
2022-12-21 21:20:56 +00:00
soulitzer
b66862ba87 [autograd Function] Don't materialize forward grad for non-differentiable types (#91183)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91183
Approved by: https://github.com/zou3519
2022-12-21 05:05:44 +00:00
albanD
0eb45d546c Bind autograd current Node for debugging purposes (#90867)
This allows to know at any point during the backward pass what is running and where the Node currently running was created at:
```python
import torch
from torch.utils._python_dispatch import TorchDispatchMode
from torch.autograd import detect_anomaly

class MyMode(TorchDispatchMode):
    def __torch_dispatch__(self, func, types, args, kwargs=None):
        node = torch._C._current_autograd_node()
        print(f"Running {func} from within {node}")
        if node is not None:
            print("The Node was created at:")
            print("\n  ".join(node.metadata["traceback_"]))
        return func(*args, **kwargs or {})

with MyMode(), detect_anomaly():
    print("FW")
    a = torch.rand(10, requires_grad=True)
    b = a.mul(2)
    b = b.div(3)
    b = b.sum()
    print("BW")
    b.backward()
```

Gives
```
$ python foo.py
foo.py:15: UserWarning: Anomaly Detection has been enabled. This mode will increase the runtime and should only be enabled for debugging.
  with MyMode(), detect_anomaly():
FW
Running aten.rand.default from within None
Running aten.mul.Tensor from within None
Running aten.div.Tensor from within None
Running aten.sum.default from within None
BW
Running aten.ones_like.default from within None
Running aten.expand.default from within <SumBackward0 object at 0x7fa40c0c6dc0>
The Node was created at:
  File "foo.py", line 20, in <module>
    b = b.sum()

Running aten.isnan.default from within <SumBackward0 object at 0x7fa40c0c6500>
The Node was created at:
  File "foo.py", line 20, in <module>
    b = b.sum()

Running aten.any.default from within <SumBackward0 object at 0x7fa32b23a780>
The Node was created at:
  File "foo.py", line 20, in <module>
    b = b.sum()

Running aten._local_scalar_dense.default from within <SumBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 20, in <module>
    b = b.sum()

Running aten.div.Tensor from within <DivBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 19, in <module>
    b = b.div(3)

Running aten.isnan.default from within <DivBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 19, in <module>
    b = b.div(3)

Running aten.any.default from within <DivBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 19, in <module>
    b = b.div(3)

Running aten._local_scalar_dense.default from within <DivBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 19, in <module>
    b = b.div(3)

Running aten.mul.Tensor from within <MulBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

Running aten.isnan.default from within <MulBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

Running aten.any.default from within <MulBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

Running aten._local_scalar_dense.default from within <MulBackward0 object at 0x7fa40c0c9190>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

Running aten.detach.default from within <AccumulateGrad object at 0x7fa40c0c9730>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

Running aten.detach.default from within <AccumulateGrad object at 0x7fa40c0c94b0>
The Node was created at:
  File "foo.py", line 18, in <module>
    b = a.mul(2)

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90867
Approved by: https://github.com/soulitzer
2022-12-20 13:41:43 +00:00
Nikita Vedeneev
3870a9e28d to_sparse_XXX: backward support (#90281)
As per title. Fixes https://github.com/pytorch/pytorch/issues/85226

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90281
Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer
2022-12-14 09:05:17 +00:00
Pearu Peterson
f4099af1e9 Fix gradcheck for BSR and BSC inputs. (#90719)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90719
Approved by: https://github.com/soulitzer, https://github.com/cpuhrsch
2022-12-14 05:37:05 +00:00
soulitzer
6d425a7ce9 Fix forward AD custom Function non-differentiable outputs (#90787)
Fixes https://github.com/pytorch/pytorch/issues/90067

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90787
Approved by: https://github.com/albanD
2022-12-13 23:13:44 +00:00
Richard Zou
24c3ad7851 Move private forward grad mode helpers to torch.autograd.forward_ad (#90240)
Motivation
- These were previously defined in functorch. They are not
functorch-specific, so I'm moving them to torch.autograd.forward_ad and
the autograd python bindings.
- I need this to avoid some of my cyclic import problems.

Should these be public APIs? Probably. Though this needs discussion, so
punting it to the future.
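
For context, the public entry points already in torch.autograd.forward_ad look like this (the private mode helpers moved by this PR are not shown):

```python
import torch
import torch.autograd.forward_ad as fwAD

primal = torch.randn(3)
tangent = torch.randn(3)
with fwAD.dual_level():
    dual = fwAD.make_dual(primal, tangent)
    out = dual.sin()
    value, jvp = fwAD.unpack_dual(out)  # jvp == primal.cos() * tangent
```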

Test Plan:
- moved the tests of these from test/functorch/test_eager_transforms.py
to test/test_autograd.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90240
Approved by: https://github.com/soulitzer
2022-12-13 14:14:02 +00:00
Richard Zou
eb314f9b1a Add setup_context staticmethod to autograd.Function (#89859)
Adds a setup_context staticmethod to autograd.Function.
If it exists, then the user splits the ctx-specific logic from the
forward() and puts it in the setup_context staticmethod.
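
A minimal sketch of that split (hypothetical MyMul): forward() no longer receives ctx, and setup_context() does the ctx bookkeeping.

```python
import torch

class MyMul(torch.autograd.Function):
    @staticmethod
    def forward(x, y):
        return x * y

    @staticmethod
    def setup_context(ctx, inputs, output):
        x, y = inputs
        ctx.save_for_backward(x, y)

    @staticmethod
    def backward(ctx, grad):
        x, y = ctx.saved_tensors
        return grad * y, grad * x
```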

Docs will come later when we remove the feature flag.

Test Plan:
- some light tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89859
Approved by: https://github.com/soulitzer
2022-12-08 19:31:04 +00:00
Richard Zou
103be1f164 Add feature flag for the autograd.Function extension (#89858)
This PR adds a private runtime feature flag for the feature work we're going
to do with extending autograd.Function. The motivation of the feature flag
is:
- to guard the feature against unsuspecting users
- control the release of the feature to when we are ready to release it

We might not even need the feature flag (because we hope to have the
work done in the next month), but it is good practice and it does touch
currently public API (autograd.Function).

Concretely, "autograd.Function extension" refers to:
- adding an optional `setup_context` staticmethod to autograd.Function
- adding an optional `vmap` staticmethod to autograd.Function
- autograd.Function support for functorch

Test Plan:
- new test that the feature flag works
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89858
Approved by: https://github.com/soulitzer
2022-12-08 19:31:01 +00:00
Sergii Dymchenko
6a7659f304 Fix issue 38095 TODO in test_autograd.py (#90031)
Fix TODO related to https://github.com/pytorch/pytorch/issues/38095

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90031
Approved by: https://github.com/clee2000
2022-12-07 19:09:43 +00:00
Ram Rachum
351d73b97f Fix exception causes all over the codebase (#90271)
This is the continuation to #90134 and hopefully the final PR in this series.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90271
Approved by: https://github.com/kit1980
2022-12-07 04:29:00 +00:00
PyTorch MergeBot
cba96366a2 Revert "remove torch.equal usages (#89527)"
This reverts commit 4095ef8b80.

Reverted https://github.com/pytorch/pytorch/pull/89527 on behalf of https://github.com/clee2000 due to broke periodic multigpu tests 4095ef8b80 https://github.com/pytorch/pytorch/actions/runs/3592806602/jobs/6049368502
2022-12-02 21:36:13 +00:00
Pearu Peterson
b87682f555 Fix gradcheck for CSR and CSC inputs. (#89786)
Partially fix-es https://github.com/pytorch/pytorch/issues/87085

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89786
Approved by: https://github.com/albanD
2022-12-02 12:35:20 +00:00
Philip Meier
4095ef8b80 remove torch.equal usages (#89527)
Preparation for the next PR in this stack: #89559.

I replaced

- `self.assertTrue(torch.equal(...))` with `self.assertEqual(..., rtol=0, atol=0, exact_device=True)`,
- the same for `self.assertFalse(...)` with `self.assertNotEqual(...)`, and
- `assert torch.equal(...)` with `torch.testing.assert_close(..., rtol=0, atol=0)` (note that we don't need to set `check_device=True` here since that is the default).

There were a few instances where the result of `torch.equal` is used directly. In those cases I've replaced it with `(... == ...).all().item()`, sometimes also dropping the `.item()` depending on the context.
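
As a small illustration of the replacement pattern (a sketch, not taken from the diff):

```python
import torch

a = torch.tensor([1.0, 2.0])
b = a.clone()

# before: assert torch.equal(a, b)
# after: bitwise-exact comparison via assert_close
torch.testing.assert_close(a, b, rtol=0, atol=0)

# before: if torch.equal(a, b): ...
# after:
if (a == b).all():
    pass
```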

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89527
Approved by: https://github.com/mruberry
2022-12-01 11:22:52 +00:00
albanD
02e2eaa9c6 Fix CopySlices logic to ensure wrapped node runs properly. (#89812)
This should remove the failures seen by https://github.com/pytorch/pytorch/pull/89720 in functionalization
Locally verified that running the following on top of this PR does pass: `python benchmarks/dynamo/huggingface.py --accuracy --backend aot_eager --training --only MobileBertForMaskedLM`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89812
Approved by: https://github.com/soumith, https://github.com/voznesenskym, https://github.com/ezyang
2022-11-29 18:44:28 +00:00
albanD
c79489c8e6 Expose to python the backward AD view_func (#89586)
This will be useful for other systems (AOTAutograd) that want to replay autograd views.

FYI @bdhirsh
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89586
Approved by: https://github.com/soulitzer
2022-11-24 03:39:58 +00:00
albanD
347a7d97a5 Deprecate decorating classes with torch.no_grad and similar (#89522)
Fixes https://github.com/pytorch/pytorch/issues/89450

I would have completely removed it, but I don't think this is particularly urgent and there are some uses of it in the wild: https://github.com/search?q=%2Ftorch%5C.no_grad%5C%28%5C%29%5Cnclass%2F&type=code
So we might as well take one release to do it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89522
Approved by: https://github.com/lezcano, https://github.com/soulitzer, https://github.com/janeyx99
2022-11-23 16:51:42 +00:00
soulitzer
6b521bbf35 Prevent module full_backward_hook from erroring in double backward (#88357)
Also clarifies documentation to say "execute if and only if gradients wrt outputs are computed" (previously, "execute every time gradients wrt inputs are computed")

See https://docs.google.com/document/d/1tFZKYdsSzRBJ7Di7SWt8X8fSg-E3eiUPwomMF10UyhM/edit for more details regarding the question: 'should module full_backward_hooks be called every time the gradients wrt module inputs are called, or should module full_backward_hooks only be called when the "backward for the module" have been computed?'

Fixes https://github.com/pytorch/pytorch/issues/88312
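
A small sketch of the clarified semantics (the hook fires only when gradients w.r.t. the module's outputs are computed):

```python
import torch
import torch.nn as nn

lin = nn.Linear(3, 3)

def hook(module, grad_input, grad_output):
    print("full_backward_hook fired")

lin.register_full_backward_hook(hook)
out = lin(torch.randn(2, 3, requires_grad=True)).sum()
out.backward()  # hook fires once for the backward of this module
```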

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88357
Approved by: https://github.com/albanD
2022-11-16 19:27:30 +00:00
soulitzer
27dc03e09b Turn internal assert when saved tensor is detached inplace into torch check (#88860)
Fixes https://github.com/pytorch/pytorch/issues/88809

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88860
Approved by: https://github.com/albanD
2022-11-12 18:33:18 +00:00
soulitzer
b92acee8f8 Add context manager to allow mutation on saved tensors (#79056)
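
A minimal sketch of how the context manager can be used, assuming it is the torch.autograd.graph.allow_mutation_on_saved_tensors API:

```python
import torch

with torch.autograd.graph.allow_mutation_on_saved_tensors():
    a = torch.randn(2, 3, requires_grad=True)
    b = a.clone()
    out = (b ** 2).sum()
    b.sin_()        # mutates a tensor saved for backward; normally an error
    out.backward()  # still computes gradients from the original saved values
```
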
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79056
Approved by: https://github.com/albanD
2022-11-11 15:18:28 +00:00
Fabio Rocha
652af5ec15 upsample_*.vec ops are now CompositeImplicit (#85638)
It was previously CompositeExplicit but it was not really necessary.
See discussion in https://github.com/pytorch/pytorch/issues/85405

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85638
Approved by: https://github.com/ezyang, https://github.com/lezcano, https://github.com/malfet, https://github.com/jansel
2022-11-09 09:58:04 +00:00
Kurt Mohler
ee28b865ee Deprecate TypedStorage, its derived classes, and all of their public methods (#85303)
Part of #85302

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85303
Approved by: https://github.com/ezyang
2022-11-08 18:11:01 +00:00
soulitzer
84a302e534 Remove wrong internal assert in handle_view_on_rebase (#88243)
Fixes: https://github.com/pytorch/pytorch/issues/88205

The `CreationMeta::NO_GRAD_MODE` path in handle_view_on_rebase wrongly assumes that the tensor would be a leaf, because tensors created in no_grad are always leaf tensors. However, due to creation_meta propagation, a view of a view created in no_grad also has `CreationMeta::NO_GRAD_MODE`, but DOES have grad_fn.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88243
Approved by: https://github.com/albanD
2022-11-02 17:50:16 +00:00
Peter Bell
bc9caafc78 record_function: update to use custom_class API (#76420)
Re-submit of gh-72302

This still has a small performance hit, but it is much smaller. On my
machine I see `_record_function_exit._RecordFunction` takes 1.05 us
compared to the `Tensor` overload taking 0.79 us.

In an overall comparison, I see a 0.7 us slowdown from 6.0 us to
6.7 us for this timeit benchmark
```python
import torch

def foo():
  with torch.profiler.record_function("foo"):
    return torch.eye(3)

%timeit foo()
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76420
Approved by: https://github.com/robieta
2022-11-02 00:39:28 +00:00
soulitzer
6ad3543a1b BE: Improve test_will_engine_execute_node unittest (#87806)
Adds the test from https://github.com/pytorch/pytorch/pull/86672

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87806
Approved by: https://github.com/albanD
2022-10-27 21:13:08 +00:00
soulitzer
adb76ef510 Expose API for backward execution order (#87507)
In this PR:
- graph_task stores graph roots on construction so that we can later traverse through the graph
- before the nodes are returned, they needed to be converted from raw_ptr to shared_ptr, and this should be OK because the graph is guaranteed to be alive

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87507
Approved by: https://github.com/albanD
2022-10-26 21:28:45 +00:00
lezcano
faf9c47abb Simplify a few diagonal-related functions (#87180)
`diag` was unnecessarily implemented as a kernel rather than as a composite
function, which made it unnecessarily difficult (explicit backward + all it entails).

We also change a few uses of `diag` on 2D tensors to `diagonal()`. The
latter returns a view rather than creating a new tensor.

We also upgrade its meta implementation to a fully-fledged
decomposition.

I tried implementing the backwards of `diagonal()` via `diag_scatter` (or better `diag_scatter_` to keep the perf) but functionalisation was failing and I was not sure how to fix this, so I moved on. It may be possible to simplify that one as well if @soulitzer or someone knows how to do this.
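
A quick illustration of the view distinction mentioned above (sketch):

```python
import torch

x = torch.eye(3)
d_view = x.diagonal()    # returns a view into x
d_copy = x.diag()        # returns a new tensor
d_view[0] = 5.0
print(x[0, 0].item())    # 5.0 -> diagonal() aliases the input
print(d_copy[0].item())  # 1.0 -> diag() made a copy
```
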
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87180
Approved by: https://github.com/ngimel, https://github.com/albanD, https://github.com/mruberry
2022-10-24 06:11:53 +00:00
soulitzer
c18eead2df Update saved variable hooks to no longer trigger on wrapped numbers (#87316)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87316
Approved by: https://github.com/ezyang, https://github.com/albanD
2022-10-20 03:01:11 +00:00
Brian Hirsh
34c86adec4 symintify all of derivatives.yaml (#86610)
Big-bang PR to symintify **all** .sizes() calls in derivatives.yaml, which will be needed for symbolic tracing.

* with the exception of `split()`, which is tougher to land because it requires internal changes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86610
Approved by: https://github.com/albanD
2022-10-14 20:15:48 +00:00
albanD
55663b7f81 Reland 3 of Symintify getitem and add the required helper functions (#86207) (#86487)
Note that this might not cover every use of the function (we know it doesn't),
but this is enough to get a few models passing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86487
Approved by: https://github.com/ezyang
2022-10-10 15:54:28 +00:00
soulitzer
ba3fde6aa0 Add multi-grad hooks (#86260)
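
A sketch of the feature (the torch.autograd.graph.register_multi_grad_hook name is an assumption here): the hook fires once the gradients for all listed tensors have been computed.

```python
import torch

a = torch.randn(3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

def hook(grads):
    # grads[i] is the gradient of the i-th tensor (or None if not needed)
    print([g is not None for g in grads])

handle = torch.autograd.graph.register_multi_grad_hook((a, b), hook)
(a * b).sum().backward()
handle.remove()
```
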
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86260
Approved by: https://github.com/albanD
2022-10-07 21:16:45 +00:00
albanD
97e56c176d Try to fix shutdown test in edge cases (#86464)
Fixes https://github.com/pytorch/pytorch/issues/85259
See the issue for debugging details.
tl;dr: when a worker thread is actually used, make sure it is initialized before exiting.
Yes, it is very unlikely it will take >10s to initialize but it is what seems to happen.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86464
Approved by: https://github.com/soulitzer, https://github.com/ezyang
2022-10-07 21:09:40 +00:00
PyTorch MergeBot
5b69b87d5a Revert "Symintify getitem and add the required helper functions (#86207)"
This reverts commit fd5085c445.

Reverted https://github.com/pytorch/pytorch/pull/86207 on behalf of https://github.com/seemethere due to  Fails internal tests, see: https://www.internalfb.com/intern/sandcastle/job/22517998926071860/insights
2022-10-07 16:10:30 +00:00
Pearu Peterson
8f2c2167d4 Support autograd on sparse_mm in full. (#86301)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86301
Approved by: https://github.com/cpuhrsch
2022-10-06 18:39:31 +00:00
albanD
fd5085c445 Symintify getitem and add the required helper functions (#86207)
Note that this might not cover every use of the function (we know it doesn't),
but this is enough to get a few models passing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86207
Approved by: https://github.com/ezyang, https://github.com/Chillee, https://github.com/bdhirsh
2022-10-06 04:46:19 +00:00
Elias Ellison
d04889323e Add Context Manager for Disabling Multithreading in Backwards, use in aot autograd (#86245)
We were running into a few issues with running multithreaded backwards in aot_autograd, such as https://github.com/pytorch/pytorch/issues/86136, and `FakeTensorMode` getting into a weird state as a result of not executing functions completely sequentially. The multithreaded backwards is lost in translation when we trace out the backwards anyway, and it adds a lot of additional complexity.
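
A sketch of the resulting context manager (assuming it is exposed as torch.autograd.set_multithreading_enabled): inside it, the backward pass runs single-threaded.

```python
import torch

x = torch.randn(4, requires_grad=True)
with torch.autograd.set_multithreading_enabled(False):
    x.sin().sum().backward()
```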

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86245
Approved by: https://github.com/albanD, https://github.com/yf225
2022-10-06 03:27:42 +00:00
PyTorch MergeBot
168ba066e3 Revert "Symintify getitem and add the required helper functions (#86207)"
This reverts commit 17addb307e.

Reverted https://github.com/pytorch/pytorch/pull/86207 on behalf of https://github.com/malfet due to Broke lint, by double-registering `meta_index_put`, but no CI was run during the outage
2022-10-05 22:42:56 +00:00
albanD
17addb307e Symintify getitem and add the required helper functions (#86207)
Note that this might not cover every use of the function (we know it doesn't),
but this is enough to get a few models passing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86207
Approved by: https://github.com/ezyang
2022-10-05 21:19:00 +00:00
Jing Xu
f20e4eab7b Fix ITT unit-tests if PyTorch is compiled with USE_ITT=OFF (#86199)
Fixes https://github.com/pytorch/pytorch/pull/84848#discussion_r986329680

@malfet @slgong-fb
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86199
Approved by: https://github.com/malfet
2022-10-04 21:57:05 +00:00
Richard Zou
a262ccea58 Change torch.autograd.graph.disable_saved_tensors_hooks to be public API (#85994)
Also addresses some comments from the review in
https://github.com/pytorch/pytorch/pull/85971
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85994
Approved by: https://github.com/albanD, https://github.com/soulitzer
2022-10-03 16:25:01 +00:00
Richard Zou
7c72bc48d8 Add mechanism to disable the "saved tensors hooks" feature (#85971)
The rationale for this is that functorch doesn't currently work with saved
variable hooks or checkpointing, and we need some way to
disable the feature.

Concretely:
- there's a context manager that does the disabling
- this feature is disabled on a thread-local basis
- one can set an error message or use the default error message that
says the feature has been disabled

Since it is thread local I needed to update ATen/ThreadLocalState. To
make things nicer, this PR refactors all the "saved tensors hooks"
related TLS things into a single struct.
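
A minimal sketch of the context manager (using the torch.autograd.graph.disable_saved_tensors_hooks name it is exposed under; the string is the error raised if saved-tensors hooks are used while disabled):

```python
import torch

with torch.autograd.graph.disable_saved_tensors_hooks("saved tensors hooks are disabled here"):
    x = torch.randn(3, requires_grad=True)
    x.sin().sum().backward()  # fine: no saved-tensors hooks are registered
```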

Test Plan:
- new test

Differential Revision: [D39970936](https://our.internmc.facebook.com/intern/diff/D39970936)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85971
Approved by: https://github.com/albanD, https://github.com/soulitzer
2022-09-30 20:03:58 +00:00
PyTorch MergeBot
801818f9e6 Revert "Add mechanism to disable the "saved tensors hooks" feature (#85553)"
This reverts commit 5aa183d2bc.

Reverted https://github.com/pytorch/pytorch/pull/85553 on behalf of https://github.com/atalman due to Reverting since failed build-fisp-diff-linux_platform010-opt
2022-09-30 14:31:09 +00:00
Richard Zou
5aa183d2bc Add mechanism to disable the "saved tensors hooks" feature (#85553)
The rationale for this is that functorch doesn't currently work with saved
variable hooks or checkpointing, and we need some way to
disable the feature.

Concretely:
- there's a context manager that does the disabling
- this feature is disabled on a thread-local basis
- one can set an error message or use the default error message that
says the feature has been disabled

Since it is thread local I needed to update ATen/ThreadLocalState. To
make things nicer, this PR refactors all the "saved tensors hooks"
related TLS things into a single struct.

Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85553
Approved by: https://github.com/soulitzer
2022-09-28 22:49:28 +00:00
Mikayla Gawarecki
afaee00fec Add python nested_tensor and as_nested_tensor constructors in torch.nested (#85593)
Remove `torch.nested_tensor` which has erroneous behavior wrt gradients (could be either a leaf or not a leaf). Introduce `torch.nested.nested_tensor` and `torch.nested.as_nested_tensor` in the vein of `torch.tensor` and `torch.as_tensor`. Done in nested `__init__.py` for now but can move to pybind in future (when we want to load from numpy/nested lists).

Discussed offline with @cpuhrsch and pybind constructor (https://github.com/pytorch/pytorch/pull/85536) was more gnarly than expected, so we can move to that when we do need loading from numpy etc.
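
A short sketch of the two constructors being introduced:

```python
import torch

# copies the given tensors into a new nested tensor
nt = torch.nested.nested_tensor([torch.randn(2, 3), torch.randn(4, 3)])

# mirrors torch.as_tensor: constructs from existing tensors, preserving autograd history
t = torch.randn(2, 3, requires_grad=True)
nt2 = torch.nested.as_nested_tensor([t])
```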

Differential Revision: [D39806622](https://our.internmc.facebook.com/intern/diff/D39806622)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85593
Approved by: https://github.com/drisspg, https://github.com/cpuhrsch
2022-09-28 20:15:02 +00:00
soulitzer
a876432aea Expose torch._will_engine_execute_node (#84773)
Addresses: https://github.com/pytorch/pytorch/issues/83617

This PR adds a way to query the TLS graph task's exec_info, which is a map from Node to a bool indicating whether it will be executed in the current backward pass (as determined by the inputs= argument of .grad or .backward).
- this works with both custom Function nodes and normal codegened nodes
- to be able to verify whether the pyobject passed is an actual node, we now store pointers to PyTypeObjects into a set on registration
- errors out when .backward is called without inputs=, to avoid silently returning True

Alternatives:
- not sure if it is possible to bind to Python from a raw pointer to Node. At least we wouldn't be able to use existing logic, and the Python object should only hold a weak reference to the Node.
- other solutions to the motivating issue seem to require more extensive modification to the engine

See the issue linked for an example of usage
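
A rough sketch of the intended usage (hedged: the exact exposed name and signature follow the title above and may differ); the query has to run while a backward pass is in progress, e.g. from a hook:

```python
import torch

a = torch.randn(3, requires_grad=True)
b = a * 2
c = b.sum()

def hook(grad):
    # ask whether this Node will be executed by the current graph task
    print(torch._will_engine_execute_node(b.grad_fn))
    return grad

b.register_hook(hook)
torch.autograd.grad(c, inputs=(a,))
```
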
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84773
Approved by: https://github.com/albanD
2022-09-28 20:13:52 +00:00
Jing Xu
80b8886223 add itt unit test and docstrings (#84848)
Add unit tests and docstrings corresponding to PR https://github.com/pytorch/pytorch/pull/63289
UT:
1. `test_profiler_emit_itt` in `test/test_autograd.py`. This test is merely intended to catch if emit_itt breaks on construction (a minimal sketch follows these lists).
2. Test `torch.profiler.itt` functions in `test/test_itt.py`
3. Only testing that emit_itt runs when `record_shapes` option is enabled in `test/test_profiler.py`.

Docstring:
1. add ITT related info into `docs/source/bottleneck.rst`
2. add `torch.profiler.itt` functions to `docs/source/profiler.rst`
3. add docstring to `torch.profiler.itt` functions in `torch/profiler/itt.py`
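
A minimal sketch of what test 1 exercises (construction only; may require a build with USE_ITT enabled, see the related fix above):

```python
import torch

with torch.autograd.profiler.emit_itt(record_shapes=True):
    torch.randn(3).sum()
```
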
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84848
Approved by: https://github.com/malfet
2022-09-28 01:39:58 +00:00
Thomas Viehmann
e41d758e26 Handle implicit real->complex casting for backward of stack (#84993)
Fixes: #75852

P.S.: Yay for the PyTorch foundation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84993
Approved by: https://github.com/soulitzer
2022-09-19 21:20:34 +00:00
Ivan Yashchuk
01c54ad6de Remove deprecated torch.eig (#70982)
The time has come to remove deprecated linear algebra related functions. This PR removes `torch.eig`.
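
For anyone migrating, the maintained replacement is torch.linalg.eig, which returns complex eigenvalues/eigenvectors instead of the old packed real layout:

```python
import torch

A = torch.randn(3, 3)
eigenvalues, eigenvectors = torch.linalg.eig(A)  # complex-valued results
```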

cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70982
Approved by: https://github.com/Lezcano, https://github.com/malfet
2022-09-09 21:31:57 +00:00
Sergii Dymchenko
591222f5d9 Fix use-dict-literal lint (#83718)
Fix use-dict-literal pylint suggestions by changing `dict()` to `{}`. This PR should do the change for every Python file except test/jit/test_list_dict.py, where I think the intent is to test the constructor.
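
The pattern being changed, for reference:

```python
# before
options = dict()
# after (dict literal, as the lint suggests)
options = {}
```
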
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83718
Approved by: https://github.com/albanD
2022-08-24 00:26:46 +00:00
soulitzer
81843596cb Fix view_func replay in no-grad mode (#83872)
Fixes https://github.com/pytorch/pytorch/issues/83828

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83872
Approved by: https://github.com/albanD
2022-08-23 18:13:00 +00:00
Brian Hirsh
0c24af4985 Always allow tensor metadata changes (#83590)
Make it so that it is valid to set metadata after detach calls, like `x.detach().resize_(...)`.

This technically lifts some restrictions around `.data`: you can now call `x.data.resize_(...)`, which directly resizes `x` instead of erroring.
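
A small sketch of the now-allowed behavior (previously these raised an error about changing the metadata of a tensor created from .data or .detach()):

```python
import torch

x = torch.randn(2, 3)
x.data.resize_(3, 3)   # now allowed instead of erroring
x.detach().resize_(0)  # likewise allowed
```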

My understanding: Before the tensor-variable merge, when `x` and `x.data` were really different tensors, you could resize `x.data` independently of `x`, and during the merge, this error was added to avoid silent confusing behavior changes.

It was agreed that this error has been around long enough (several years) that it's acceptable to drop.  cc @albanD @ezyang.

(Ed already had a prototype PR [here](https://github.com/pytorch/pytorch/pull/83545) - I ended up making one to try to slog through test failures).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83590
Approved by: https://github.com/ezyang
2022-08-19 23:30:43 +00:00
Mikayla Gawarecki
bd0ad7a84f Add backward support for rudimentary NestedTensor.sum(dim) (#82625)
Per offline discussion, this will be updated to use expand once expand semantics for nested tensor have been fleshed out.

Next steps will be to add support for other features for forward sum mentioned on #82387 and likewise update the backward

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82625
Approved by: https://github.com/albanD
2022-08-17 18:12:00 +00:00
soulitzer
31fad3926a Add option to run anomaly mode without nan checking (#83481)
Fixes https://github.com/pytorch/pytorch/issues/83117
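
A sketch of the new option: check_nan=False keeps the creation-traceback bookkeeping while skipping the per-output NaN checks.

```python
import torch

with torch.autograd.detect_anomaly(check_nan=False):
    x = torch.randn(3, requires_grad=True)
    x.sin().sum().backward()
```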

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83481
Approved by: https://github.com/albanD
2022-08-16 22:56:23 +00:00
soulitzer
b567742038 Add ability to register prehooks to grad_fn (#83226)
This simply replicates the implementation of PyFunctionPostHooks
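
A sketch of the new capability: a prehook registered on a grad_fn sees (and may replace) the incoming grad_outputs before the Node runs.

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x * 2

def prehook(grad_outputs):
    # halve the incoming gradient before MulBackward runs
    return tuple(g * 0.5 for g in grad_outputs)

y.grad_fn.register_prehook(prehook)
y.sum().backward()
print(x.grad)  # 2 * 0.5 = 1.0 everywhere
```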

Fixes https://github.com/pytorch/pytorch/issues/83120
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83226
Approved by: https://github.com/albanD
2022-08-13 00:05:07 +00:00
Mikayla Gawarecki
e3e33cfae0 Enable codegen of per-dispatch key derivative formulas in derivatives.yaml (#82801)
`derivatives.yaml` can now take a `dispatch` entry which registers per-autograd dispatch key derivatives such as
```
name: foo(Tensor self, Tensor y) -> Tensor
dispatch:
  Default:
    x: grad
    y: grad.expand(y.sizes())
  AutogradNestedTensor:
    x: grad
    y:  NestedTensor_foo_backward(grad, y)
output_differentiability: [True]
```

However the old schema where there is no `dispatch` entry is still supported.

Would greatly appreciate feedback on *how to improve the testing strategy* of this PR. Currently I have registered an aten test op in TestOps.cpp with dummy gradients in derivatives.yaml and have some tests in test_autograd.py:TestAutogradMultipleDispatch, but I am not sure whether these are sufficiently rigorous.

Additionally, this PR also makes the assumption that sets like [VIEW_FUNCTIONS](ff5399e528/tools/autograd/gen_inplace_or_view_type.py (L60)) are per-native-function and not per-native-function-and-dispatch-key. I'm not sure whether this is necessarily the case, *would there ever be a situation where (e.g. a nested_tensor op is a view op but the aten function is not or vice versa?)*

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82801
Approved by: https://github.com/bhosmer, https://github.com/albanD
2022-08-10 19:26:29 +00:00
Jeff Daily
263c05c918 [ROCm] work-around missing hipProfilerStart/Stop (#82778)
### Description
cudaProfilerStart and cudaProfilerStop are deprecated but exposed by torch.cuda.cudart().  HIP has corresponding functions stubbed out, hipProfilerStart and hipProfilerStop, but they return hipErrorNotSupported.  Profiling in HIP is supported, but not via these deprecated APIs.

See https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__PROFILER__DEPRECATED.html.

These functions are indirectly used by one or more unit tests that would otherwise pass if the non-functional HIP APIs were replaced with a dummy function.

### Testing
Unskipped a related unit test, run by ciflow/trunk.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82778
Approved by: https://github.com/ezyang
2022-08-08 18:25:13 +00:00
albanD
7dd795cbed Prevent ref cycle creation in inner hook (#82776)
Towards fixing https://github.com/pytorch/pytorch/issues/82482

This PR fixes two things:

## 1) memory leak
The .detach() call prevents a true memory leak in some cases where the user function is using multiple ops in a row that save their inputs. The following chain of objects keep each other alive
- the `storage` object
- a recomputed Tensor y
- y's grad_fn FooBackward (in c++)
- FooBackward's SavedVariables (in c++)
- SavedVariable Hook
- the `inner_pack` function
- captures `storage`

Since part of this cycle is in c++, the python gc is not able to break it.
Should THPCppFunction_traverse actually visit its SavedVariables, which in turn should visit their hooks? I think the answer is yes, but I haven't dived into which python object is traversing what: if there is non-unique ownership of the c++ object, it makes the traversal a lot trickier. @ezyang do you think we should dive into this more?

In this case, this can be easily solved anyways by storing `y.detach()` in the `storage` object as we don't care about the temporary backward graph that gets created during the second forward call.
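
A runnable schematic of that idea (not the actual checkpoint code): the pack hook hands back a detached alias, so the recomputed value's grad_fn and its SavedVariables are not kept alive through the hook's closure.

```python
import torch

def pack(t):
    return t.detach()  # drop the temporary backward graph, keep the values

def unpack(t):
    return t

with torch.autograd.graph.saved_tensors_hooks(pack, unpack):
    a = torch.randn(3, requires_grad=True)
    out = a.sin().exp().sum()
out.backward()  # gradients still flow; only the saved values were detached
```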

## 2) Lifetime of the recomputed buffers
The new storage system is now such that the lifetime of the recomputed buffer is directly linked to the SavedVariable c++ object. Meaning that this buffer will get deleted IIF the SavedVariable is cleared.
This means that we now get the exact same behavior as the version without the saved variable hook where Tensors are saved directly on the SavedVariable object.

This is great as this solves all the cases where the non-checkpoint version used to work but the checkpoint version does not (even double access or retain_graph=True).

The one drawback of this approach though is that the buffers do NOT get cleared when the user passes in `retain_graph=True`! The next backward won't even re-run the forward as it already has all the buffers available. Is this a problem that you think we would need to find a solution for @rohan-varma or is it niche enough that we don't care for now?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82776
Approved by: https://github.com/ezyang, https://github.com/rohan-varma
2022-08-06 00:31:22 +00:00
soulitzer
1cafb1027f Fix leak when create_graph and full backward hook registered (#82788)
Fixes #82528
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82788
Approved by: https://github.com/albanD
2022-08-05 15:35:36 +00:00
Rohan Varma
98cad3d305 [Checkpoint] Fix autocasting (#81766)
Add support for the correct autocasting in the non-reentrant checkpoint as it exists in the reentrant-version.

This was noticed by @awgu.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81766
Approved by: https://github.com/albanD
2022-07-22 21:33:56 +00:00
soulitzer
f69768fed4 [forward ad] Fix codegen to ignore undefined outputs (#81114)
I don't think there's a way to avoid functions returning undefined tensors as outputs, so codegen will have to detect them before calling _set_fw_grad. Alternatively, we can just make calling _set_fw_grad with undefined self a no-op, but I'm biasing toward keeping _set_fw_grad more strict in case it is called in other areas.

Fixes https://github.com/pytorch/pytorch/issues/81111
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81114
Approved by: https://github.com/albanD
2022-07-11 15:01:39 +00:00
soulitzer
b69a2546f4 [forward ad] Skip some metadata checks for 0 numel tensor (#81055)
Fixes https://github.com/pytorch/pytorch/issues/80507

Pull Request resolved: https://github.com/pytorch/pytorch/pull/81055
Approved by: https://github.com/ngimel
2022-07-11 15:01:39 +00:00
Rohan Varma
e14941ef79 Add kwarg support for no_reentrant checkpoint (#80987)
Supports kwargs input to the function when using `torch.utils.checkpoint` with use_reentrant=False. This is required to unblock T5 activation checkpointing and MetaSeq use cases.
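
A short sketch of the use case (hypothetical fn):

```python
import torch
from torch.utils.checkpoint import checkpoint

def fn(x, scale=1.0):
    return (x * scale).sin()

x = torch.randn(3, requires_grad=True)
out = checkpoint(fn, x, scale=2.0, use_reentrant=False)  # kwargs now supported
out.sum().backward()
```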

Closes https://github.com/pytorch/pytorch/issues/79887
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80987
Approved by: https://github.com/zhaojuanmao
2022-07-09 05:07:13 +00:00
soulitzer
516f3198d6 Fix retains grad behavior after in-place (#79996)
See this doc: https://docs.google.com/document/d/1KiRdnoj6B4cI3yl017hTbCqcOGO1gWIpUf20sldipHM/edit#

Two issues, (1) regarding hooks in general and (2) regarding retains grad hooks, are fixed; Python hooks, which rely on a different mechanism, are not discussed here:
- Hooks in cpp in general
  - (fixed) new hooks registered to a newer version of the tensor no longer get applied to the grad_fn
    associated with the older version of the tensor (the version current when the first hook was registered)
  - (unchanged) hooks registered to the older version of the tensor remain active on that older version
- Retains grad hooks
  - (fixed) now get moved to the latest grad_fn (see the sketch after this list). NB: to the user, retains_grad is not considered
    a hook or expected to behave like one; hooks are properties of the grad_fn, whereas
    retains_grad-ness is a property of the tensor.
- (not in this PR) Python hooks
  - (will fix) same issue as hooks in cpp where new hooks are being applied to grad_fn associated
    with the older version of the tensor
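
A small sketch of the fixed retains-grad behavior referenced above:

```python
import torch

a = torch.randn(3, requires_grad=True)
b = a.clone()
b.retain_grad()
b.mul_(2.0)      # in-place op gives b a new grad_fn
b.sum().backward()
print(b.grad)    # populated for the latest version of b after the fix
print(a.grad)    # 2.0 everywhere, from the chain through mul_
```
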
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79996
Approved by: https://github.com/albanD
2022-07-08 19:13:28 +00:00
soulitzer
ea987086fc Fix test_gradcheck_forward_ad_respects_requires_grad for slow gradcheck (#80401)
Tested locally
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80401
Approved by: https://github.com/albanD
2022-06-28 13:51:44 +00:00
PyTorch MergeBot
a2d159e6e2 Fix forward AD copy_ into same-sized tensor without fw grad
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79653

Approved by: https://github.com/albanD
2022-06-17 18:55:32 +00:00
drisspg
b9f83cb737 use is_same_size in autograd init (#79553)
Broke #79446 into a smaller commit that just adds is_same_size to the autograd __init__ file. This is_same_size function will dispatch to the original behavior for regular tensors.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79553
Approved by: https://github.com/soulitzer
2022-06-15 19:49:42 +00:00
Rohan Varma
44fe851feb [WIP] Fix non-reentrant hooks based checkpointing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78752

Approved by: https://github.com/albanD
2022-06-14 01:13:33 +00:00
soulitzer
99ffeff949 [forward ad] Sync conj for between primal and tangent on set forward grad
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78358

Approved by: https://github.com/Lezcano, https://github.com/zou3519
2022-06-08 04:20:17 +00:00
yuguo68
efdb4192bc set data permits requires_grad=True on integer tensor
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78436

Approved by: https://github.com/albanD, https://github.com/soulitzer
2022-06-01 15:56:32 +00:00
soulitzer
c88367442d [forward ad] forbid non-float non-complex tangent and primal
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78361

Approved by: https://github.com/albanD
2022-05-31 20:58:19 +00:00
Elias Ellison
678213ead2 Fake Tensor Part 1
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77969

Approved by: https://github.com/ezyang
2022-05-31 16:20:35 +00:00
Taylor Robie
e17f14fab2 [Profiler] Propagate metadata into Engine::evaluate_function event.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77696

https://github.com/pytorch/pytorch/pull/63619 added a RECORD_FUNCTION guard to make calls to `Engine::evaluate_function` visible regardless of the underlying op. While useful, this creates a call that looks like a forward call, which somewhat complicates stitching forward and backward ops together. I don't want to add complexity (and therefore work) on the hot path; instead it's fairly straightforward to stitch things back together in post. This PR simply propagates sequence number and forward tid info up to the `evaluate_function` event.

Differential Revision: [D36302562](https://our.internmc.facebook.com/intern/diff/D36302562/)

Approved by: https://github.com/aaronenyeshi
2022-05-22 22:39:13 +00:00
Alban Desmaison
090eddf1c7 Fix MPS interaction with autograd engine
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77644

Approved by: https://github.com/kulinseth, https://github.com/soulitzer, https://github.com/seemethere
2022-05-17 21:26:16 +00:00
Mikayla Gawarecki
7ba4e124e6 Bugfix gradient formula for index_reduce('prod') + separate out sample_inputs for index_reduce
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77382

Approved by: https://github.com/cpuhrsch
2022-05-16 18:43:57 +00:00
soulitzer
beb405035c Update forward AD metadata check to skip stride check when size is 0
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77269

Approved by: https://github.com/albanD
2022-05-16 15:53:17 +00:00
Kulin Seth
e011a8e18b Enable PyTorch operations on MPS Backend. (#77343)
Add PyTorch operations to MPS backend.

- https://github.com/pytorch/pytorch/issues/77394
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77343
Approved by: https://github.com/albanD
2022-05-13 18:28:53 +00:00
Mikayla Gawarecki
465e0ae266 Bugfix scatter_reduce backward formulas
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76523

Approved by: https://github.com/albanD
2022-05-05 20:22:39 +00:00
Xiaodong Wang
2291960d3f Back out "record_function: update to use custom_class API" (#76253)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76253

We're observing large QPS regression on the original PR https://github.com/pytorch/pytorch/pull/72302. For the training job we had, it regressed from 720k QPS to 450k QPS (see the test plan in FB internal). We suspect this is because the api was changed from `_record_function_enter` to `_record_function_enter_new`, and we're running experiments to confirm that. Will add more details when the runs in the test plan have finished. For now, it's better to revert the diff to unblock internal usecases and we can think about how to reland this diff later.

Original commit changeset: dc9939f1fa6d

Original Phabricator Diff: D35257354

Test Plan:
on trunk: f338665947

with this diff: f338502850

Reviewed By: malfet, robieta

Differential Revision: D35853300

fbshipit-source-id: dd38042aeacb848f66756491a4c849c7c652a0e1
2022-04-26 17:49:57 -04:00
Alban Desmaison
eb69e8a3ed Revert "Revert "record_function: update to use custom_class API""
This reverts commit 3f9f35b9f8.

This should be done via a clean revert as this has been in master for a long time.
Doing a quick fix here to make sure we don't break master.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76172
Approved by: https://github.com/atalman
2022-04-21 14:18:28 +00:00
PyTorch MergeBot
3f9f35b9f8 Revert "record_function: update to use custom_class API"
This reverts commit 5630c5ac75.

Reverted https://github.com/pytorch/pytorch/pull/72302 on behalf of https://github.com/atalman
2022-04-21 13:59:48 +00:00
albanD
cd0591dff3 Change default TLS behavior in dispatch to favor is-a style
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75827

Approved by: https://github.com/ezyang
2022-04-20 17:32:29 +00:00
Peter Bell
cc56fac213 Fix complex to real casting warning in _to_copy backward
Fixes #75781

A Real->Complex cast should result in a gradient with no imaginary
component, so discarding the imaginary component is expected.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75805

Approved by: https://github.com/albanD
2022-04-19 14:04:13 +00:00
Ivan Yashchuk
38a758e251 Add forward AD for rsub, polar, and FFT
This PR adds forward AD support for:
- torch.rsub
- tensor.__rsub__
- torch.polar
- torch.fft.fft
- torch.fft.fft2
- torch.fft.fftn
- torch.fft.hfft
- torch.fft.hfft2
- torch.fft.hfftn
- torch.fft.rfft
- torch.fft.rfft2
- torch.fft.rfftn
- torch.fft.ifft
- torch.fft.ifft2
- torch.fft.ifftn
- torch.fft.ihfft
- torch.fft.ihfft2
- torch.fft.ihfftn
- torch.fft.irfft
- torch.fft.irfft2
- torch.fft.irfftn
- torch.stft
- torch.istft

Ref. https://github.com/pytorch/pytorch/issues/71117

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75326
Approved by: https://github.com/soulitzer
2022-04-08 05:01:01 +00:00
Ivan Yashchuk
65ed1e3526 Add forward AD for torch.atan2
This PR adds a formula for the total differential of the atan2 function.
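
For reference, the standard total differential of atan2 (not quoted from the PR) is:

```latex
d\,\operatorname{atan2}(y, x) = \frac{x\,dy - y\,dx}{x^{2} + y^{2}}
```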

Ref. https://github.com/pytorch/pytorch/issues/71117

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75027
Approved by: https://github.com/soulitzer
2022-04-01 05:24:19 +00:00
Nikita Shulga
bfac65dfe5 [testing] Update dispatch macros (#74977)
This PR is reland of #74289 
Co-authored-by: Khushi Agrawal <khushiagrawal411@gmail.com>
2022-03-30 14:13:21 -07:00
PyTorch MergeBot
2e4152b118 Revert "[testing] Update dispatch macros"
This reverts commit eed19a0f38.

Reverted https://github.com/pytorch/pytorch/pull/74289 on behalf of https://github.com/malfet
2022-03-30 19:52:37 +00:00
Khushi Agrawal
eed19a0f38 [testing] Update dispatch macros
Hi,
This PR is the follow-up PR of #71561. (the previous PR had a couple of merge conflicts and was reverted, this PR resolves that).
Please take a look. Thanks!

cc: @pmeier @mruberry @kshitij12345
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74289
Approved by: https://github.com/pmeier, https://github.com/mruberry
2022-03-30 16:10:16 +00:00
Peter Bell
5630c5ac75 record_function: update to use custom_class API
Merge after forward-compatibility period is over.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72302

Approved by: https://github.com/albanD
2022-03-30 15:57:28 +00:00
Kurt Mohler
5375b2e994 Resolve int[]? arguments to new OptionalIntArrayRef class
This PR uses the `OptionalArrayRef` template class that was drafted in #64084.

Fixes #44409
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70864
Approved by: https://github.com/ezyang
2022-03-26 01:45:50 +00:00
Richard Zou
a75c718d7c [reland] Update tls logic to work better with guarded call (#73925)
This PR relands https://github.com/pytorch/pytorch/pull/73925 which we
reverted due to a large breakage in functorch.

As a part of the reland, this PR adds a change we agreed upon in
https://docs.google.com/document/d/1i7Y9VZp9PxtgVcrQh6nGQXkXkPc1uMep0dM-OMOGJ9o/edit
The change is moving the PythonTLSSnapshot key after
DynamicLayerFrontMode.

Test Plan:
- I tested this with an updated version of functorch and all the tests
pass so I think we are out of the woods.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74577

Approved by: https://github.com/albanD
2022-03-25 19:51:10 +00:00
Richard Zou
a9d9f91f31 Revert "Update tls logic to work better with guarded call (#73925)"
This reverts commit dff02851d1.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74268

Approved by: https://github.com/albanD
2022-03-16 17:00:19 +00:00
Nikita Shulga
ef066f0832 Revert D34856571: [pytorch][PR] Replace get_all_ type macros with the ATen dispatch macros.
Test Plan: revert-hammer

Differential Revision:
D34856571 (3ded7b1da3)

Original commit changeset: 0dca038bcad5

Original Phabricator Diff: D34856571 (3ded7b1da3)

fbshipit-source-id: 594553fa0b710d78beba59d5d2b646f1f1270386
(cherry picked from commit 8090eb9b12dcf452a9e7dc01792a66fb91b563b6)
2022-03-15 22:07:11 +00:00
Khushi Agrawal
3ded7b1da3 Replace get_all_ type macros with the ATen dispatch macros. (#71561)
Summary:
Hi, Team!
The PR is motivated from https://github.com/pytorch/pytorch/pull/71153#discussion_r782446738. It aims to replace `get_all` type macros with the ATen dispatch macros.
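
Roughly, the replacement swaps the get_all_* helpers for the dispatch-macro-style helpers in torch.testing._internal.common_dtype (a sketch; the exact helper chosen varies per call site):

```python
import torch
from torch.testing._internal.common_dtype import all_types_and_complex_and

# before: get_all_dtypes()
dtypes = all_types_and_complex_and(torch.half, torch.bfloat16, torch.bool)
print(torch.float32 in dtypes)
```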

The files it iterates over are: (Thanks, Lezcano, for the idea!!)

<details>
<summary>

`test/test_autograd.py`</summary>

<p>

```python
43:from torch.testing._internal.common_dtype import get_all_dtypes
8506:        floating_dt = [dt for dt in get_all_dtypes() if dt.is_floating_point]
```

</p>
</details>

<details>
<summary>

`test/test_binary_ufuncs.py`</summary>

<p>

```python
26:    all_types_and_complex_and, integral_types_and, get_all_dtypes, get_all_int_dtypes, get_all_math_dtypes,
27:    get_all_complex_dtypes, get_all_fp_dtypes,
935:    dtypes(*get_all_dtypes(include_bool=False, include_complex=False))
1035:    dtypes(*get_all_dtypes(
1488:    dtypes(*(get_all_dtypes(include_bool=False, include_bfloat16=False)))
1879:    dtypes(*product(get_all_dtypes(include_complex=False), get_all_dtypes(include_complex=False)))
1887:    dtypes(*(get_all_int_dtypes() + [torch.bool]))
1913:    dtypes(*(get_all_fp_dtypes()))
1941:    dtypes(*(get_all_fp_dtypes()))
1977:    dtypes(*product(get_all_complex_dtypes(), get_all_dtypes()))
2019:    dtypes(*product(get_all_fp_dtypes(), get_all_fp_dtypes()))
2048:    dtypes(*get_all_dtypes())
2110:    dtypes(*product(get_all_dtypes(include_complex=False),
2111:                     get_all_dtypes(include_complex=False)))
2128:            types = [torch.bool, torch.bfloat16] + get_all_int_dtypes()
2173:        if dtypes[1] in get_all_fp_dtypes():
2178:    dtypes(*product(get_all_fp_dtypes(),
2179:                     get_all_fp_dtypes()))
2260:    dtypesIfCUDA(*set(get_all_math_dtypes('cuda')) - {torch.complex64, torch.complex128})
2261:    dtypes(*set(get_all_math_dtypes('cpu')) - {torch.complex64, torch.complex128})
2273:    dtypesIfCUDA(*set(get_all_math_dtypes('cuda')) - {torch.complex64, torch.complex128})
2274:    dtypes(*set(get_all_math_dtypes('cpu')) - {torch.complex64, torch.complex128})
2307:    dtypes(*get_all_math_dtypes('cpu'))
2319:    dtypes(*get_all_fp_dtypes(include_bfloat16=False))
2331:    dtypes(*get_all_int_dtypes())
2356:    dtypes(*get_all_dtypes(include_bfloat16=False, include_bool=False, include_complex=False))
2393:        if dtype in get_all_int_dtypes():
2614:    dtypes(*get_all_dtypes())
2624:    dtypes(*tuple(itertools.combinations_with_replacement(get_all_dtypes(), 2)))
2806:    dtypes(*list(product(get_all_dtypes(include_complex=False),
2807:                          get_all_dtypes(include_complex=False))))
2866:    dtypes(*list(product(get_all_complex_dtypes(),
2867:                          get_all_complex_dtypes())))
2902:    dtypes(*product(get_all_dtypes(), get_all_dtypes()))
2906:    dtypes(*product(get_all_dtypes(), get_all_dtypes()))
2910:    dtypes(*product(get_all_dtypes(), get_all_dtypes()))
3019:        dtypes = [torch.float, torch.double] + get_all_complex_dtypes()
3221:    dtypes(*get_all_dtypes(include_complex=False))
3407:    dtypes(*list(product(get_all_dtypes(include_bool=False),
3408:                          get_all_dtypes(include_bool=False))))
3504:    dtypes(*product(get_all_dtypes(include_complex=False, include_bfloat16=False),
3505:                     get_all_dtypes(include_complex=False, include_bfloat16=False)))
3516:            if x.dtype in get_all_int_dtypes() + [torch.bool]:
3643:    dtypes(*product(get_all_dtypes(include_complex=False,
3645:                     get_all_dtypes(include_complex=False,
```

</p>
</details>

<details>
<summary>

`test/test_complex.py`</summary>

<p>

```python
6:from torch.testing._internal.common_dtype import get_all_complex_dtypes
11:    dtypes(*get_all_complex_dtypes())
```

</p>
</details>

<details>
<summary>

`test/test_foreach.py`</summary>

<p>

```python
18:    get_all_dtypes, get_all_int_dtypes, get_all_complex_dtypes, get_all_fp_dtypes,
142:            if dtype in get_all_int_dtypes():
179:            disable_fastpath = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool]
201:            disable_fastpath = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool]
205:                disable_fastpath |= dtype in get_all_int_dtypes() + [torch.bool]
211:                disable_fastpath |= dtype not in get_all_complex_dtypes()
241:                bool_int_div = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool]
246:                    disable_fastpath |= dtype in get_all_int_dtypes() + [torch.bool]
248:                    disable_fastpath |= dtype not in get_all_complex_dtypes()
250:                    disable_fastpath |= True and dtype not in get_all_complex_dtypes()
307:        disable_fastpath = dtype in get_all_int_dtypes() + [torch.bool]
365:        if opinfo.name == "_foreach_abs" and dtype in get_all_complex_dtypes():
376:    ops(foreach_unary_op_db, dtypes=get_all_dtypes())
393:         dtypes=get_all_dtypes(include_half=True, include_bfloat16=True, include_complex=False))
401:    ops(foreach_minmax_op_db, dtypes=get_all_fp_dtypes(include_bfloat16=True, include_half=True))
426:            if ord in (1, 2) and dtype in torch.testing.get_all_fp_dtypes():
439:    dtypes(*get_all_dtypes())
449:    ops(foreach_binary_op_db, dtypes=get_all_dtypes())
481:    ops(foreach_binary_op_db, dtypes=get_all_dtypes())
536:            if dtype in get_all_int_dtypes() + [torch.bool] and foreach_op == torch._foreach_div:
545:    ops(foreach_binary_op_db, dtypes=get_all_dtypes())
637:    ops(foreach_pointwise_op_db, allowed_dtypes=get_all_fp_dtypes(include_half=False, include_bfloat16=False))
```

</p>
</details>

<details>
<summary>

`test/test_linalg.py`</summary>

<p>

```python
29:    all_types, floating_types, floating_and_complex_types, get_all_dtypes, get_all_int_dtypes, get_all_complex_dtypes,
30:    get_all_fp_dtypes,
111:    dtypes(*(get_all_dtypes()))
794:        float_and_complex_dtypes = get_all_fp_dtypes() + get_all_complex_dtypes()
807:    dtypes(*(get_all_int_dtypes()))
828:    dtypes(*(get_all_fp_dtypes() + get_all_complex_dtypes()))
841:        if dtype in get_all_complex_dtypes():
844:    dtypes(*itertools.product(get_all_dtypes(),
845:                               get_all_dtypes()))
855:        for dtypes0, dtypes1, dtypes2 in product(get_all_dtypes(), repeat=3):
5607:                  *get_all_fp_dtypes(include_half=not CUDA9, include_bfloat16=(CUDA11OrLater and SM53OrLater)))
5608:    dtypes(*(set(get_all_dtypes()) - {torch.half, torch.bool}))
5644:    dtypes(*(get_all_complex_dtypes() + get_all_fp_dtypes()))
6255:    dtypesIfCUDA(*get_all_complex_dtypes(),
6256:                  *get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater)),
6292:    dtypesIfCUDA(*get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater))))
6323:    dtypesIfCUDA(*get_all_complex_dtypes(),
6324:                  *get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater))))
6325:    dtypes(*get_all_complex_dtypes(), *get_all_fp_dtypes())
6358:    dtypesIfCUDA(*([torch.float, torch.double] + get_all_complex_dtypes()))
6556:    dtypes(*get_all_fp_dtypes(), *get_all_complex_dtypes())
6668:    dtypes(*get_all_fp_dtypes(), *get_all_complex_dtypes())
6741:    dtypes(*get_all_fp_dtypes(), *get_all_complex_dtypes())
```

</p>
</details>

<details>
<summary>

`test/test_nn.py`</summary>

<p>

```python
37:from torch.testing._internal.common_dtype import integral_types, get_all_fp_dtypes, get_all_math_dtypes
50:    onlyNativeDeviceTypes, deviceCountAtLeast, largeTensorTest, expectedFailureMeta, skipMeta, get_all_device_types, \
8862:                for device in get_all_device_types():
9629:            for dt1 in get_all_math_dtypes(device):
9630:                for dt2 in get_all_math_dtypes(device):
9631:                    for dt3 in get_all_math_dtypes(device):
9648:            for input_dtype in get_all_math_dtypes(device):
9664:            for input_dtype in get_all_math_dtypes(device):
13015:    dtypes(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
13034:    dtypes(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
13159:    dtypes(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
17400:    dtypesIfCUDA(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
17768:    dtypesIfCUDA(*get_all_fp_dtypes())
17773:    dtypesIfCUDA(*get_all_fp_dtypes())
17778:    dtypesIfCUDA(*get_all_fp_dtypes())
17783:    dtypesIfCUDA(*get_all_fp_dtypes())
17788:    dtypesIfCUDA(*get_all_fp_dtypes())
17793:    dtypesIfCUDA(*get_all_fp_dtypes())
17798:    dtypesIfCUDA(*get_all_fp_dtypes())
17963:    dtypesIfCUDA(*get_all_fp_dtypes())
17977:    dtypesIfCUDA(*get_all_fp_dtypes())
18684:    def test_cross_entropy_loss_prob_target_all_reductions(self, device):
```

</p>
</details>

<details>
<summary>

`test/test_numpy_interop.py`</summary>

<p>

```python
12:from torch.testing._internal.common_dtype import get_all_dtypes
399:    dtypes(*get_all_dtypes())
```

</p>
</details>

<details>
<summary>

`test/test_ops.py`</summary>

<p>

```python
12:from torch.testing._internal.common_dtype import floating_and_complex_types_and, get_all_dtypes
86:        for dtype in get_all_dtypes():
```

</p>
</details>

<details>
<summary>

`test/test_reductions.py`</summary>

<p>

```python
16:    get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_complex_dtypes, get_all_fp_dtypes,
360:         allowed_dtypes=get_all_dtypes(include_bfloat16=False))
366:         allowed_dtypes=get_all_dtypes(include_bfloat16=False))
394:         allowed_dtypes=get_all_dtypes(include_bfloat16=False))
750:        for dtype in [dtype for dtype in get_all_math_dtypes('cpu') if dtype != torch.float16]:
1404:    dtypes(*get_all_dtypes(include_bool=False, include_complex=False))
1457:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1458:              get_all_complex_dtypes()))
1465:            return dtype in get_all_int_dtypes()
1494:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False)))
1501:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False)))
1507:    dtypes(*(get_all_complex_dtypes()))
1514:        dtypes = list(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False))
1523:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False)))
1531:        if dtype in get_all_fp_dtypes():
1608:    dtypes(*(get_all_dtypes(include_half=True, include_bfloat16=False,
1837:    dtypes(*get_all_dtypes(include_bool=False, include_complex=False))
1855:    dtypes(*(set(get_all_dtypes(include_bool=False, include_complex=False)) - {torch.uint8}))
3219:        for dtype in get_all_dtypes(include_half=True, include_bfloat16=False,
```

</p>
</details>

<details>
<summary>

`test/test_serialization.py`</summary>

<p>

```python
26:from torch.testing._internal.common_dtype import get_all_dtypes
586:        for device, dtype in product(devices, get_all_dtypes()):
589:            for other_dtype in get_all_dtypes():
```

</p>
</details>

<details>
<summary>

`test/test_shape_ops.py`</summary>

<p>

```python
18:from torch.testing._internal.common_dtype import get_all_dtypes
230:    dtypes(*get_all_dtypes(include_complex=False, include_bool=False, include_half=False,
232:    dtypesIfCUDA(*get_all_dtypes(include_complex=False, include_bool=False, include_bfloat16=False))
344:    dtypes(*get_all_dtypes())
443:    dtypes(*get_all_dtypes())
461:    dtypes(*get_all_dtypes())
570:    dtypes(*get_all_dtypes(include_complex=False))
```

</p>
</details>

<details>
<summary>

`test/test_sort_and_select.py`</summary>

<p>

```python
12:    all_types, all_types_and, floating_types_and, get_all_dtypes, get_all_int_dtypes, get_all_fp_dtypes,
136:    dtypes(*set(get_all_dtypes()) - {torch.bool, torch.complex64, torch.complex128})
231:    dtypes(*set(get_all_dtypes()) - {torch.bool, torch.complex64, torch.complex128})
296:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
647:    dtypesIfCUDA(*get_all_fp_dtypes())
678:    dtypesIfCUDA(*(get_all_dtypes(include_complex=False,
682:    dtypes(*(get_all_dtypes(include_complex=False, include_bool=False, include_half=False, include_bfloat16=False)))
739:    dtypesIfCPU(*set(get_all_dtypes()) - {torch.complex64, torch.complex128})
740:    dtypes(*set(get_all_dtypes()) - {torch.bfloat16, torch.complex64, torch.complex128})
799:    dtypesIfCPU(*set(get_all_dtypes()) - {torch.complex64, torch.complex128})
800:    dtypes(*set(get_all_dtypes()) - {torch.bfloat16, torch.complex64, torch.complex128})
```

</p>
</details>

<details>
<summary>

`test/test_sparse.py`</summary>

<p>

```python
20:from torch.testing import get_all_complex_dtypes, get_all_fp_dtypes
29:    floating_and_complex_types, floating_and_complex_types_and, get_all_dtypes, get_all_int_dtypes,
1963:            return dtype in get_all_int_dtypes()
1994:    dtypes(*get_all_dtypes(include_bool=False, include_half=False,
2103:            return dtype in get_all_int_dtypes()
2138:    dtypes(*get_all_dtypes(include_bool=False, include_half=False,
2626:        all_sparse_dtypes = get_all_dtypes(include_complex=True)
2633:        all_sparse_dtypes = get_all_dtypes(include_complex=True)
3230:    dtypes(*get_all_complex_dtypes(),
3231:            *get_all_fp_dtypes(include_half=False, include_bfloat16=False))
3234:                  *get_all_fp_dtypes(
```

</p>
</details>

<details>
<summary>

`test/test_sparse_csr.py`</summary>

<p>

```python
7:from torch.testing import get_all_complex_dtypes, get_all_fp_dtypes, floating_and_complex_types, make_tensor
17:from torch.testing._internal.common_dtype import floating_types, get_all_dtypes
120:    dtypes(*get_all_dtypes())
133:    dtypes(*get_all_dtypes())
150:    dtypes(*get_all_dtypes())
180:    dtypes(*get_all_dtypes())
201:    dtypes(*get_all_dtypes())
210:    dtypes(*get_all_dtypes())
225:    dtypes(*get_all_dtypes())
244:    dtypes(*get_all_dtypes())
263:    dtypes(*get_all_dtypes())
285:    dtypes(*get_all_dtypes())
411:    dtypes(*get_all_dtypes())
482:    dtypes(*get_all_dtypes())
502:    dtypes(*get_all_dtypes())
562:    dtypes(*get_all_dtypes())
588:    dtypesIfCUDA(*get_all_complex_dtypes(),
589:                  *get_all_fp_dtypes(include_half=SM53OrLater, include_bfloat16=SM80OrLater))
745:    dtypesIfCUDA(*get_all_complex_dtypes(),
746:                  *get_all_fp_dtypes(include_half=SM53OrLater and TEST_CUSPARSE_GENERIC,
765:    dtypesIfCUDA(*get_all_complex_dtypes(),
766:                  *get_all_fp_dtypes(include_half=SM53OrLater and TEST_CUSPARSE_GENERIC,
801:                  *torch.testing.get_all_fp_dtypes(include_bfloat16=SM80OrLater,
841:                  *torch.testing.get_all_fp_dtypes(include_bfloat16=SM80OrLater,
1182:    dtypes(*get_all_dtypes())
1276:    dtypes(*get_all_dtypes(include_bool=False, include_half=False, include_bfloat16=False))
1286:    dtypes(*get_all_dtypes())
```

</p>
</details>

<details>
<summary>

`test/test_tensor_creation_ops.py`</summary>

<p>

```python
21:    onlyCUDA, skipCPUIf, dtypesIfCUDA, skipMeta, get_all_device_types)
23:    get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes
150:        for dt in get_all_dtypes():
160:        for dt in get_all_dtypes():
314:        dtypes = [dtype for dtype in get_all_dtypes() if dtype != torch.bfloat16]
1012:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1013:              get_all_complex_dtypes()))
1032:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1033:              get_all_complex_dtypes()))
1050:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1051:              get_all_complex_dtypes()))
1745:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1779:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1868:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1926:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1954:            do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, torch_device)
1956:            do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, None)
1957:            do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, torch_device)
2538:        for device in get_all_device_types():
2645:        for dtype in get_all_dtypes():
2678:    dtypes(*(get_all_fp_dtypes(include_half=False, include_bfloat16=False) +
2679:              get_all_complex_dtypes()))
2716:    dtypes(*get_all_fp_dtypes(include_half=False, include_bfloat16=False))
2827:            for dt in get_all_dtypes():
2913:    dtypes(*get_all_dtypes(include_bool=False, include_half=False))
2914:    dtypesIfCUDA(*get_all_dtypes(include_bool=False, include_half=True))
3028:    dtypes(*(get_all_fp_dtypes() + get_all_complex_dtypes()))
3033:    dtypes(*(get_all_fp_dtypes() + get_all_complex_dtypes()))
3074:    dtypes(*get_all_dtypes(include_bool=False, include_half=False, include_complex=False))
3075:    dtypesIfCUDA(*((get_all_int_dtypes() + [torch.float32, torch.float16, torch.bfloat16])
3077:                    else get_all_dtypes(include_bool=False, include_half=True, include_complex=False)))
3873:    dtypes(*get_all_dtypes())
3884:    dtypes(*get_all_dtypes(include_bool=False))
3916:            for other in get_all_dtypes():
3922:    dtypes(*get_all_dtypes())
3932:    dtypes(*get_all_dtypes(include_bool=False))
3955:    dtypes(*get_all_dtypes(include_bool=False))
3961:    dtypes(*get_all_dtypes(include_bool=False))
3965:    dtypes(*get_all_dtypes())
```

</p>
</details>

<details>
<summary>

`test/test_testing.py`</summary>

<p>

```python
25:from torch.testing._internal.common_dtype import get_all_dtypes
31:    dtypes(*(get_all_dtypes(include_half=True, include_bfloat16=False,
```

</p>
</details>

<details>
<summary>

`test/test_torch.py`</summary>

<p>

```python
51:    expectedAlertNondeterministic, get_all_device_types, skipXLA)
57:    get_all_fp_dtypes, get_all_int_dtypes, get_all_math_dtypes, get_all_dtypes, get_all_complex_dtypes
296:            for d in get_all_device_types():
323:            for device in get_all_device_types():
324:                for dt1 in get_all_dtypes():
325:                    for dt2 in get_all_dtypes():
343:            all_dtypes = get_all_dtypes()
350:            all_dtypes = get_all_dtypes()
781:            for dtype in get_all_dtypes():
986:            for device in get_all_device_types():
1017:            for device in get_all_device_types():
1018:                for dtype in get_all_math_dtypes(device):
2792:            for device in get_all_device_types():
3186:    dtypes(*get_all_dtypes())
3195:        for error_dtype in get_all_dtypes():
3203:    dtypes(*get_all_dtypes())
3212:        for error_dtype in get_all_dtypes():
4539:    dtypes(*get_all_fp_dtypes())
4545:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
4577:    dtypes(*get_all_fp_dtypes(include_half=False, include_bfloat16=False))
4578:    dtypesIfCPU(*(get_all_fp_dtypes(include_half=False, include_bfloat16=True)))
4579:    dtypesIfCUDA(*(get_all_fp_dtypes(include_bfloat16=False)))
4599:    dtypes(*(get_all_fp_dtypes(include_half=False, include_bfloat16=False)))
4600:    dtypesIfCPU(*(get_all_dtypes(include_half=False, include_bfloat16=False, include_complex=False)))
4601:    dtypesIfCUDA(*(get_all_dtypes(include_bfloat16=False, include_complex=False)))
4613:        for p_dtype in get_all_fp_dtypes(include_half=device.startswith('cuda'), include_bfloat16=False):
4628:    dtypes(*(get_all_fp_dtypes(include_half=False, include_bfloat16=False)))
4629:    dtypesIfCUDA(*(get_all_fp_dtypes(include_bfloat16=False)))
4640:    dtypes(*get_all_fp_dtypes())
4723:    dtypes(*get_all_fp_dtypes())
4735:    dtypes(*get_all_fp_dtypes(include_bfloat16=False))
4736:    dtypesIfCUDA(*get_all_fp_dtypes())
4747:    dtypes(*get_all_fp_dtypes())
4761:    dtypes(*get_all_fp_dtypes())
4771:    dtypes(*get_all_fp_dtypes())
4792:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
5302:    dtypes(*get_all_dtypes(include_bfloat16=False))
5322:    dtypes(*get_all_dtypes(include_half=False, include_bfloat16=False))
5323:    dtypesIfCPU(*get_all_dtypes(include_bfloat16=False))
5324:    dtypesIfCUDA(*get_all_dtypes(include_bfloat16=False))
5591:        for dt in get_all_dtypes():
5611:        for dt in get_all_dtypes():
5678:        for dt in get_all_dtypes():
5696:    dtypesIfCUDA(*set(get_all_math_dtypes('cuda')))
5697:    dtypes(*set(get_all_math_dtypes('cpu')))
5746:    dtypes(*get_all_dtypes())
5780:    dtypes(*get_all_dtypes())
5885:    dtypes(*get_all_dtypes())
5902:    dtypes(*get_all_dtypes())
5945:    dtypes(*get_all_dtypes())
5979:    dtypes(*get_all_dtypes(include_bool=False))
6049:    dtypes(*get_all_dtypes(include_bool=False))
6092:    dtypes(*(get_all_fp_dtypes(include_bfloat16=False, include_half=False) +
6093:              get_all_complex_dtypes()))
6094:    dtypesIfCPU(*get_all_dtypes())
6095:    dtypesIfCUDA(*get_all_dtypes())
6122:    dtypes(*(get_all_fp_dtypes(include_bfloat16=False, include_half=False) +
6123:              get_all_complex_dtypes()))
6124:    dtypesIfCPU(*get_all_dtypes())
6125:    dtypesIfCUDA(*get_all_dtypes())
6163:    dtypes(*(get_all_fp_dtypes(include_bfloat16=False, include_half=False) +
6164:              get_all_complex_dtypes()))
6165:    dtypesIfCPU(*get_all_dtypes())
6166:    dtypesIfCUDA(*get_all_dtypes())
6190:    dtypes(*(get_all_complex_dtypes() +
6191:              get_all_int_dtypes()))
6238:    dtypes(*get_all_dtypes())
6323:    dtypes(*get_all_dtypes())
6389:    dtypes(*product(get_all_dtypes(), (torch.uint8, torch.bool)))
6699:    dtypesIfCUDA(*set(get_all_math_dtypes('cuda')))
6700:    dtypes(*set(get_all_math_dtypes('cpu')))
7452:    dtypes(*get_all_dtypes(include_bool=False))
7461:    dtypes(*get_all_dtypes(include_bool=False))
7477:    dtypes(*get_all_dtypes(include_bool=False))
7496:    dtypes(*get_all_dtypes(include_bool=False))
7538:    dtypes(*get_all_dtypes(include_bool=False))
8162:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes() +
8163:              get_all_complex_dtypes()))
8175:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes() +
8176:              get_all_complex_dtypes()))
```

</p>
</details>

<details>
<summary>

`test/test_type_promotion.py`</summary>

<p>

```python
14:    get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_fp_dtypes
187:        for dtype in get_all_dtypes():
262:        dtypes1 = get_all_math_dtypes('cuda')
263:        dtypes2 = get_all_math_dtypes(device)
339:    dtypes(*itertools.product(get_all_dtypes(), get_all_dtypes()))
468:            for dt1 in get_all_math_dtypes(device):
469:                for dt2 in get_all_math_dtypes(device):
519:            for dt1 in get_all_math_dtypes(device):
520:                for dt2 in get_all_math_dtypes(device):
528:        for dt in get_all_math_dtypes(device):
561:        for dtype in get_all_dtypes():
766:                                          dtypes=get_all_math_dtypes(device))
771:                                          dtypes=get_all_math_dtypes(device))
782:                                          dtypes=get_all_math_dtypes(device))
879:        dtypes = get_all_dtypes(include_bfloat16=False)
898:        dtypes = get_all_dtypes(include_bfloat16=False, include_bool=False)
965:    dtypesIfCUDA(*itertools.product(get_all_dtypes(include_bfloat16=False, include_complex=False),
966:                                     get_all_dtypes(include_bfloat16=False, include_complex=False)))
967:    dtypes(*itertools.product(get_all_dtypes(include_half=False, include_bfloat16=False,
969:                               get_all_dtypes(include_half=False, include_bfloat16=False,
976:            return dtype in get_all_int_dtypes() + [torch.bool]
979:            return dtype in get_all_fp_dtypes(include_half=True, include_bfloat16=False)
```

</p>
</details>

<details>
<summary>

`test/test_unary_ufuncs.py`</summary>

<p>

```python
24:    floating_types_and, all_types_and_complex_and, floating_and_complex_types_and, get_all_dtypes, get_all_math_dtypes,
25:    get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes
517:    dtypes(*(get_all_int_dtypes() + [torch.bool] +
518:              get_all_fp_dtypes(include_bfloat16=False)))
596:    dtypes(*get_all_fp_dtypes(include_half=True, include_bfloat16=False))
611:        invalid_input_dtypes = get_all_int_dtypes() + \
612:            get_all_complex_dtypes() + \
619:        for dtype in get_all_fp_dtypes(include_half=True, include_bfloat16=False):
1048:    dtypes(*get_all_math_dtypes('cpu'))
1182:    dtypesIfCUDA(*get_all_fp_dtypes())
1190:    dtypesIfCUDA(*get_all_fp_dtypes())
1205:    dtypesIfCUDA(*get_all_fp_dtypes())
1215:    dtypesIfCUDA(*get_all_fp_dtypes())
1307:    dtypes(*(get_all_dtypes(include_bool=False)))
1349:    dtypes(*(get_all_fp_dtypes(include_half=False) +
1350:              get_all_complex_dtypes()))
1351:    dtypesIfCUDA(*(get_all_fp_dtypes(include_half=True) +
1352:                    get_all_complex_dtypes()))
```

</p>
</details>

<details>
<summary>

`test/test_view_ops.py`</summary>

<p>

```python
19:    get_all_dtypes, get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes
124:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
131:    dtypes(*get_all_dtypes(include_bfloat16=False))
213:            for view_dtype in [*get_all_fp_dtypes(), *get_all_complex_dtypes()]:
220:    dtypes(*get_all_dtypes())
224:        for view_dtype in get_all_dtypes():
305:    dtypes(*get_all_complex_dtypes(include_complex32=True))
343:    dtypes(*get_all_dtypes())
354:    dtypes(*get_all_dtypes())
364:    dtypes(*get_all_dtypes())
374:    dtypes(*get_all_dtypes())
384:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
395:    dtypes(*get_all_complex_dtypes())
426:    dtypes(*get_all_complex_dtypes())
451:    dtypes(*product(get_all_complex_dtypes(), get_all_dtypes()))
1263:    dtypes(*(torch.testing.get_all_dtypes()))
1279:    dtypes(*(torch.testing.get_all_dtypes()))
1405:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1406:              get_all_complex_dtypes()))
1471:    dtypes(*get_all_dtypes(include_bfloat16=False))
1574:    dtypes(*get_all_dtypes())
1601:    dtypes(*get_all_dtypes(include_bfloat16=False))
1632:    dtypes(*get_all_dtypes(include_bfloat16=False))
1711:        for dt in get_all_dtypes():
1717:        for dt in get_all_dtypes():
1724:        for dt in get_all_dtypes():
```

</p>
</details>
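
To make the pattern concrete, here is a hedged before/after sketch of the kind of substitution applied at these call sites (representative mappings only; the exact helper chosen varies per site, and the helpers below come from `torch.testing._internal.common_dtype`):

```python
import torch
from torch.testing._internal.common_dtype import (
    all_types_and_complex_and, floating_types_and, integral_types,
)

# Before:  @dtypes(*get_all_dtypes())
# After:   @dtypes(*all_types_and_complex_and(torch.bool, torch.half, torch.bfloat16))

# Before:  @dtypes(*get_all_fp_dtypes(include_bfloat16=False))
# After:   @dtypes(*floating_types_and(torch.half))

# Before:  @dtypes(*get_all_int_dtypes())
# After:   @dtypes(*integral_types())
```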

I'm looking forward to your viewpoints. Thanks :)

cc: mruberry kshitij12345 anjali411

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71561

Reviewed By: samdow

Differential Revision: D34856571

Pulled By: mruberry

fbshipit-source-id: 0dca038bcad5cf69906245c496d2e61ac3876335
(cherry picked from commit b058f67b4313143efa714ab105f36e74083131b9)
2022-03-15 20:31:41 +00:00
Duncan Hill
0988dc481a [Codemod][Codemod deprecated unittest asserts] fbcode//caffe2/test (#71708)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71708

In Python 3.2, a number of asserts were deprecated.

In Python 3.11, these asserts are deleted completely. The files in this change still use the deprecated asserts.

Switch over to the supported syntax for 3.2 onwards.

Test Plan: Tested on the internal test suite runner.

Reviewed By: ajtulloch

Differential Revision: D33503694

fbshipit-source-id: a150f296033260acf8365d77b837ce0679f57361
(cherry picked from commit abf60ed97409265222915d8265aaabedd625fd93)
2022-03-15 19:28:52 +00:00
Alban Desmaison
dff02851d1 Update tls logic to work better with guarded call (#73925)
Summary:
Description of the new behavior is in PythonFallbackKernel.cpp.
The updated test makes sure that we only call alias on the first Tensor.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73925

Reviewed By: samdow

Differential Revision: D34862940

Pulled By: albanD

fbshipit-source-id: 4d020e41c8bb8b10262dcafd524e84a5ad4d7af0
(cherry picked from commit 0aa6b56dbd3dcee830453fb02cd6c83ab7a8be06)
2022-03-14 21:26:31 +00:00
Alban Desmaison
b2a5507654 Fix deadlock in some edge case in autograd (#73961)
Summary:
Minimal example that deadlocks before but not after:
```python
import torch
from torch.autograd import Function

class Foo(Function):
    @staticmethod
    def forward(ctx, x):
        return x.clone()

    @staticmethod
    def backward(ctx, gO):
        return gO.clone()

def get_out():
    inp = torch.rand(2, requires_grad=True)

    # The python function is first so that it runs
    # last in the backward pass
    right = Foo.apply(inp)

    # An op that creates new memory
    left1 = inp.clone()
    # An op that saves its input
    left2 = left1 ** 2

    # Inplace modify so that the backward for
    # left2 always raises an error
    left1 += 1

    # An op that takes both side as input.
    # After running, both side's last op will be in
    # the ready queue
    # And the op for left will run first as it was
    # executed last during the forward
    out = left2 + right

    return out

# Nothing should be a global variable here because, from what
# I can see, Python leaks all global objects
get_out().sum().backward()

```

Since this requires the python interpreter to die, it is hard to test in CI.
Let me know if you have an idea how to do it though.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73961

Reviewed By: malfet

Differential Revision: D34752747

Pulled By: albanD

fbshipit-source-id: 1a537b1f733e161e8d3ff053cd432b37b34d432a
(cherry picked from commit 17943e4c04c782d81deab439e010195f04e75bbd)
2022-03-09 20:42:15 +00:00
soulitzer
15df909d34 Move autograd functional tests to separate file (#73852)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73852

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D34703586

Pulled By: soulitzer

fbshipit-source-id: 58e8b17ab3dc41ce7bf15bb32ea0653d90f44791
(cherry picked from commit 526ab20fd6026144171bf3b02a5381da57ca9f91)
2022-03-08 23:45:34 +00:00
Peter Bell
9ef5c679ef record_function: add torchbind alternative API (#72301)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72301

First step in resolving #35026.

This adds `PythonRecordFunction` which is a `torch::CustomClassHolder`
for `at::RecordFunction` to keep the ATen code free of torch includes.
It also adds a new, currently unused internal API function
`_record_function_enter_new`, which returns the torchbind object.

Once the FC period is expired, `torch.profiler.record_function` will
be updated to use this new internal API. Then once BC period is
expired, the cpp_custom_type_hack-based API can be removed.
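
For reference, a minimal sketch of the public context-manager API whose internals this change backs (the torchbind-based `_record_function_enter_new` remains internal and is not called directly):

```python
import torch
from torch.profiler import record_function

x = torch.randn(64, 64)
# The public API is unchanged; only its backing implementation migrates
# from cpp_custom_type_hack to a torchbind object after the FC/BC periods.
with record_function("my_matmul"):
    y = x @ x
```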

Test Plan: Imported from OSS

Reviewed By: dagitses

Differential Revision: D34586311

Pulled By: robieta

fbshipit-source-id: d3eb9ffad7b348548a2b22c75203a92d1cb5115b
(cherry picked from commit 92d2ca808e5fbd20c9d6645dcabc3f059f9ef2d3)
2022-03-08 03:26:27 +00:00
anjali411
086645ad77 Update __torch_dispatch__ to return op overload instead of the opoverload packet function (#72673)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72673
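
A hedged illustration of the distinction the title refers to (standard `torch.ops` objects; not code from this PR):

```python
import torch

packet = torch.ops.aten.add           # OpOverloadPacket: groups every overload of aten::add
overload = torch.ops.aten.add.Tensor  # OpOverload: the specific aten::add.Tensor overload
print(type(packet).__name__, type(overload).__name__)

# After this change, the `func` argument passed to __torch_dispatch__ is the
# specific OpOverload (e.g. aten.add.Tensor) rather than the OpOverloadPacket.
a, b = torch.randn(2), torch.randn(2)
print(torch.equal(packet(a, b), overload(a, b)))  # both are callable; prints True
```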

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D34627164

Pulled By: anjali411

fbshipit-source-id: 3cb6406a392d530bf9da36b4d8e0a62b30e6497e
(cherry picked from commit 65b85a0a67df4d0f16ac8964e2b685d478a610fb)
2022-03-07 22:38:42 +00:00
Philip Meier
b5f2574f36 no longer coalesce sparse COO tensors before comparison (#69751)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69751

cc nikitaved pearu cpuhrsch IvanYashchuk

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D34262453

Pulled By: ezyang

fbshipit-source-id: e2e62d2aa03fc569d2951c880960b256f5dc4aaa
(cherry picked from commit cb6b0ef719)
2022-02-17 02:33:08 +00:00
Alban Desmaison
a877441494 Clean up use of cpu ready queue in autograd engine (#72688)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72688

Refactor how we know what to run on the cpu queue.
The Lazy Tensor was moved there as it is always present as a device guard and would make the number of devices 1 all the time (forcing the creation of a thread).

FYI wconstab  you most likely don't care about this unless you ever use multiple Lazy devices?
This should slightly improve the perf if you run backward with Lazy Tensors as the work will be done in the main thread and not a worker thread.

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D34180245

Pulled By: albanD

fbshipit-source-id: 88c5d5bdd631ad01bf271d720d1eab69aba84fc0
(cherry picked from commit da7e9b902f)
2022-02-12 01:52:56 +00:00
Ivan Yashchuk
fb7c4780f9 Add autograd tests for addmm, addmv, mm, mv and CSR matrix input (#71949)
Summary:
This PR adds autograd tests for `addmm, addmv, mm, mv` functions that check computing derivatives wrt dense inputs.

Currently, neither autograd engine, nor gradcheck can work with CSR inputs<->CSR outputs. I added xfailing tests for that.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71949

Reviewed By: george-qi

Differential Revision: D33834653

Pulled By: cpuhrsch

fbshipit-source-id: 4144c1547427d4cd6b01495cf45242bb4e914e86
(cherry picked from commit 2cb362283d)
2022-02-11 23:14:02 +00:00
soulitzer
91e4f7788c Gradcheck forward AD respects requires grad but run with requires_grad=False (#72309)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72309

Fixes: https://github.com/pytorch/pytorch/issues/72113

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33991570

Pulled By: soulitzer

fbshipit-source-id: 610de162e9848d2d3b12e0fb039860fd9dee844f
(cherry picked from commit a7ecb13610)
2022-02-10 03:30:40 +00:00
soulitzer
e39bf13316 Fix internal assert custom function when input does not require grad (#72008)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72008

Fixes  #71119

Technically BC-breaking because when an input does not require grad, previously it was returned as-is instead of a view because it didn't need to. Now we will also return a view in that case (whether or not forward AD runs).
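
A minimal sketch of the behavior change described above (hedged: `Identity` is an illustrative custom Function, not taken from this PR):

```python
import torch
from torch.autograd import Function

class Identity(Function):
    @staticmethod
    def forward(ctx, x):
        return x  # returns an input unchanged

    @staticmethod
    def backward(ctx, g):
        return g

x = torch.randn(3)         # does not require grad
y = Identity.apply(x)
# Previously y was the input returned as-is; per this change it is a view of x.
print(y is x)
```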

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33859553

Pulled By: soulitzer

fbshipit-source-id: 81b3fa371f4c0904630878500aa190492c562367
(cherry picked from commit ee74bc8234)
2022-02-01 22:36:04 +00:00
Richard Zou
5735f2f875 Make detach redispatch like a regular PyTorch operator (#71707)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71707

Why?
- detach should behave like jax.stop_gradient in functorch. Because it
does not detach all the way through, functorch (as well as a Tensor
Subclass wrapping a Tensor subclass) won't see it after the first
layer/subclass handles it.

How?
- This PR changes detach to dispatch all the way through to the backend.
- This PR also modifies native::detach to call shallow_copy_and_detach
instead of native::alias. This is because today, the semantics of detach
and alias are different -- they differ only by
allow_tensor_metadata_change. In the future, we may choose to deprecate
this flag.
- NB: Before and after this PR, detach() shows up twice in
torch_dispatch: https://github.com/pytorch/pytorch/issues/71725. This is
not a regression so I didn't want to fix it in this PR because it is
weird to fix.

Test Plan: - added new tests; run existing tests

Reviewed By: albanD

Differential Revision: D33752860

Pulled By: zou3519

fbshipit-source-id: 40cc2dc8232e75a02586a4ba5b0ef5f16cb76617
(cherry picked from commit f88aae426e)
2022-01-28 16:13:36 +00:00
lezcano
84f1685397 Rewrite svd and linalg.svd as structured kernels (#69827)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69827

In general, the current pattern allows for implementing optimisations
for all the backends in a common place (see for example the optimisation
for empty matrices).

After this PR, `torch.svd` is implemented in terms of `linalg.svd` and
`linalg.svdvals`, as expected. This makes it differentiable in the case
when `compute_uv=False`, although this is not particularly important, as
`torch.svd` will eventually be deprecated.

This PR also instantiates smaller `U` / `V` when calling cusolver_gesvdj
in the cases when `full_matrices=False` or `compute_uv=False`.

The memory for the auxiliary `U` and `V` in the cases above, needed for some
cuSOLVER routines, is allocated through raw allocators rather than through fully
fledged tensors, as it's just a blob of memory the algorithm requests.
As the code is better structured now, it was easier to see that `U` and
`Vh` needn't be allocated when calling `svd_cusolver_gesvd`.

Now `linalg.svdvals` works as expected wrt the `out=` parameter.
Note that in the test `test_svd_memory_allocation` we were
passing a tensor of the wrong size and dtype and the test seemed to
pass...
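
A short sketch of the user-facing pieces mentioned above (reduced SVD plus `linalg.svdvals` with `out=`), assuming a CPU double-precision input:

```python
import torch

A = torch.randn(5, 3, dtype=torch.float64)
U, S, Vh = torch.linalg.svd(A, full_matrices=False)  # reduced SVD: smaller U / Vh
print(U.shape, S.shape, Vh.shape)                    # (5, 3), (3,), (3, 3)

out = torch.empty(3, dtype=torch.float64)
torch.linalg.svdvals(A, out=out)                     # out= is now honored
print(torch.allclose(out, S))                        # True
```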

This PR also changes the backward formula to avoid saving the input
matrix, as it's not necessary. In a follow up PR, I will clean the
backward formula and make it more numerically stable and efficient.

This PR also does a number of memory optimisations here and there, and fixes
the call to cusolver_gesvd, which was incorrect for m <= n. To test
this path, I compiled the code with a flag to unconditionally execute
the `if (!gesvdj_convergence_check.empty())` branch, and all the tests
passed.

I also took this chance to simplify the tests for these functions in
`test_linalg.py`, as we had lots of tests that were testing some
functionality that is already currently tested in the corresponding
OpInfos. I used xwang233's feature to test both MAGMA and CUDA
backends. This is particularly good for SVD, as cuSOLVER is always
chosen over MAGMA when available, so testing MAGMA otherwise would be
tricky.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33751983

Pulled By: mruberry

fbshipit-source-id: 11d48d977946345583d33d14fb11a170a7d14fd2
(cherry picked from commit a1860bd567)
2022-01-27 18:38:30 +00:00
anjali411
de8d0203e9 Allow torch.Tensor.real on real-valued tensors (#71718)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71718
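
A minimal illustration of what the title enables (before this change, `.real` raised for non-complex dtypes):

```python
import torch

x = torch.randn(3)             # real-valued, non-complex tensor
print(torch.equal(x.real, x))  # True: .real is now allowed and returns the same values
```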

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33770668

Pulled By: anjali411

fbshipit-source-id: bad21ebe72220b9017a0b8efa71eaeab84bd9e9f
(cherry picked from commit aa0a922757)
2022-01-25 22:30:48 +00:00
soulitzer
7a0c97195f Add save_for_forward to custom function (#71569)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71569

Not sure if this is the right API
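
A hedged sketch of how `save_for_forward` pairs with a `jvp` staticmethod (this mirrors the pattern that ended up documented; the retrieval through `ctx.saved_tensors` and the usage below are assumptions for illustration, not text from this PR):

```python
import torch
import torch.autograd.forward_ad as fwAD
from torch.autograd import Function

class Mul(Function):
    @staticmethod
    def forward(ctx, x, y):
        ctx.save_for_forward(x, y)   # tensors saved for the forward-mode (jvp) pass
        ctx.save_for_backward(x, y)  # tensors saved for the reverse-mode (backward) pass
        return x * y

    @staticmethod
    def jvp(ctx, x_t, y_t):
        # Product rule; assumes both inputs carry tangents, as in the usage below.
        x, y = ctx.saved_tensors
        return y * x_t + x * y_t

    @staticmethod
    def backward(ctx, g):
        x, y = ctx.saved_tensors
        return g * y, g * x

with fwAD.dual_level():
    a = fwAD.make_dual(torch.randn(3), torch.ones(3))
    b = fwAD.make_dual(torch.randn(3), torch.ones(3))
    out = Mul.apply(a, b)
    print(fwAD.unpack_dual(out).tangent)
```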

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33695395

Pulled By: soulitzer

fbshipit-source-id: 652b5758f15d901f98ff0da94e977030c7f3415b
(cherry picked from commit 9421a6846a)
2022-01-25 07:30:46 +00:00
soulitzer
09aeadf4ab Fix custom function forward AD internal assert (#71531)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71531

Based on the comment above the original internal assert, this is the desired check.
1. Don't error, and automatically make jvp return a view for that tensor output (this is easier than I originally thought: https://github.com/pytorch/pytorch/pull/71531#discussion_r789211877)
2. Error (currently doing)

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33695399

Pulled By: soulitzer

fbshipit-source-id: dba49890a55ad1dd59ed5c41faa96bf7cfc9e562
(cherry picked from commit fdb0f266f5)
2022-01-25 07:30:46 +00:00
soulitzer
1cc3291716 Fix custom function when non tensor argument precedes tensor argument (#71530)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71530

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33695397

Pulled By: soulitzer

fbshipit-source-id: 49ccd062f73ccf69c47aca2552fde182d582be2a
(cherry picked from commit 68d502a013)
2022-01-25 07:30:46 +00:00
Victor Quach
a3b7dd7b78 Enable nested default hooks (#70932)
Summary:
When default hooks are set, they are pushed onto a stack.
When nesting context managers, only the inner-most hooks will
be applied.

There is special care needed to update the TLS code. See also https://github.com/pytorch/pytorch/issues/70940 (i.e. do we need to be storing the enabled flag as well?)
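
A hedged sketch of the nesting behavior this enables, using the public `torch.autograd.graph.saved_tensors_hooks` context manager (illustrative hook bodies only):

```python
import torch
from torch.autograd.graph import saved_tensors_hooks

def tagging_hooks(name):
    # pack wraps the saved tensor with a tag; unpack strips the tag again
    return saved_tensors_hooks(lambda t: (name, t), lambda packed: packed[1])

x = torch.randn(3, requires_grad=True)
with tagging_hooks("outer"):
    with tagging_hooks("inner"):   # inner-most hooks apply while this context is active
        y = (x * x).sum()
y.backward()
print(x.grad)
```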

Fixes https://github.com/pytorch/pytorch/issues/70134

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70932

Reviewed By: mruberry

Differential Revision: D33530370

Pulled By: albanD

fbshipit-source-id: 3197d585d77563f36c175d3949115a0776b309f4
2022-01-11 15:03:49 -08:00
soulitzer
7397683b57 Add forward AD formulas for mv, scatter_add, _s_where (#70468)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70468

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33405364

Pulled By: soulitzer

fbshipit-source-id: 7681c33fb264a7a3ec6436ebb7c5bb07cd5ffc3d
2022-01-10 13:54:10 -08:00
Mike Ruberry
84b7832010 Updates CUDA memory leak check to verify against driver API and print more diagnostic information (#69556)
Summary:
Per title

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69556

Reviewed By: mrshenli

Differential Revision: D32954770

Pulled By: mruberry

fbshipit-source-id: a6c2ae6f704422c178569980ca4b9c72c4272f55
2021-12-17 23:37:49 -08:00
soulitzer
51033ec840 Add forward AD layout check for storage numel (#68631)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68631

This PR:
- Adds the check that the storage numel of the base and tangent tensors is the same. This is to support the case when as_strided reveals elements that aren't indexable by the input tensor.
- Skips the check when batched tensors are involved, because using as_strided to reveal elements that are not indexable by the input tensor is already not allowed in vmap.
- Adds tests for the above two cases, as well as an edge case regarding conj bit (what about neg bit?)

For functorch:
- we need to copy the batching rule implemented here

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D32899678

Pulled By: soulitzer

fbshipit-source-id: 54db9550dd2c93bc66b8fb2d36ce40799ebba794
2021-12-14 04:34:25 -08:00
soulitzer
af7ee9fc01 Forward AD for inplace comparison operators (#69597)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69597

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33020600

Pulled By: soulitzer

fbshipit-source-id: 0c9ab210f7dc952a41fbcaa1f5f7921c2fdeb18b
2021-12-12 00:11:14 -08:00
soulitzer
baf92f9d5a Fix copy_ forward AD to handle broadcasting (#69592)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69592

Currently, forward AD function for`copy_` (in `VariableTypeManual`) does not handle the broadcasting case. ~EDIT: but that is not a design decision, not a bug. In this PR, we make that clear as a comment.~

Note: `broadcast_to` does not have a batching rule in core, so the ops that rely on `copy_` to broadcast will still fail batched forward grad computation.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33020603

Pulled By: soulitzer

fbshipit-source-id: 09cb702bffc74061964a9c05cfef5121f8164814
2021-12-12 00:11:08 -08:00
milesial
0ccb1dcdbb Fix inference_mode decorator (#68617)
Summary:
This fixes the case when `torch.inference_mode` is called with `mode=False` (disabled). When used as a decorator, it ignored the argument and enabled inference mode anyway.

`_DecoratorContextManager` is changed so that a new instance is a copy instead of a new instance with default parameters.

I also added more tests to cover this case.

Current behaviour:

```python
>>> import torch
>>> x = torch.ones(1, 2, 3, requires_grad=True)
>>> @torch.inference_mode(mode=False)
... def func(x):
...     return x * x
...
>>> out = func(x)
>>> out.requires_grad
False
```

New behaviour (fixed):

```python
>>> import torch
>>> x = torch.ones(1, 2, 3, requires_grad=True)
>>> @torch.inference_mode(mode=False)
... def func(x):
...     return x * x
...
>>> out = func(x)
>>> out.requires_grad
True
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68617

Reviewed By: mrshenli

Differential Revision: D32958434

Pulled By: albanD

fbshipit-source-id: 133c69970ef8bffb9fc9ab5142dedcffc4c32945
2021-12-09 10:45:09 -08:00
soulitzer
b61c532f96 Make make_dual redispatch (#68630)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68630

Constraints:
1) (functorch) if all the inputs to an op have requires_grad=False and don't have tangents, then their VariableType
    kernel should be a no-op i.e., behave like a redispatch. This is due to functorch's DynamicLayerStack
   having the autograd key by default (which is so that transformations like vmap still work with autograd)
2) (inference mode) inference tensors in inference mode will call straight into the kernel, so we should still do something sensible
    inside even if we normally wouldn't redispatch into it.
3) ~Should support potential application of interposition below autograd: `nn.Parameter` is an example of subclassing where the subclass
    is not preserved when an operation is performed. There is an exception though: we want calling `make_dual` on a
    `nn.Parameter` to preserve its parameterness.~
4) Should avoid calls to shallow_copy_and_detach to avoid spurious calls into `__python_dispatch__`.

This PR:
- does not redispatch to `make_dual` from its `ADInplaceOrView` kernel to satisfy (1)
- calls into `alias` from the kernel in the native namespace so that behavior is consistent with other views in inference mode to satisfy (2)
- discussion of (3). We still wouldn't be able to directly override `make_dual` below autograd. In this PR, instead of not redispatching at all, we choose to redispatch into `at::alias` so that one can override `make_dual`. The side effect is that one would not be able to distinguish calls between the two, which can be problematic (though a straightforward but hacky solution would be to create a new `at::alias_for_make_dual` that would allow users to distinguish the two). This isn't ideal but seems to be the simplest way to satisfy (3). We don't pursue that hacky solution here.
- (4) is satisfied because we remove calls to `shallow_copy_and_detach`

<details>
<summary> A potentially less hacky but more involved solution? (WIP) </summary>

Realizing that make_dual is more like requires_grad, perhaps it shouldn't be autograd explicit? Make make_dual a composite or python-only construct. i.e., it would be a view on the primal followed by something to the effect of primal.set_fw_grad(tangent).

Additional constraints:
5) make_dual needs to be backward-differentiable (I can't think of any applications yet because
   technically as a high-order function, jvp's input is the tangent only, "detach" is not applied on
   the tangent, so one would still be able to propagate gradients through it).
6) set_fw_grad needs to raise an error if there is a layout mismatch and base is a forward-differentiable view

Possible plan
- (6) implies that a plain view would not suffice. We need a `detach`-like operation to ensure that set_fw_grad
  knows the view is not forward differentiable.
- (5) implies that is this (new) `detach` would need to be backward differentiable (API TBD).
- (3) is no longer relevant because make_dual is no longer autograd explicit, but perhaps this new detach should behave like the current one? There is a lot of logic to replicate for detach, so this may be hard.
- (1) is satisfied if we use the current detach logic, and (4) is trivial.

I'm not convinced that this is the right solution either, because in the end does (3) still work?

 </details>

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D32899679

Pulled By: soulitzer

fbshipit-source-id: 98e13ae954e14e1e68dbd03eb5ab3300d5ed2c5e
2021-12-08 17:56:03 -08:00
Rohan Varma
049debd97d [Reland][Autograd/Checkpoint] Checkpoint implementation without reentrant autograd (#69508)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69508

Original Phabricator Diff: D32704467 (e032dae329)

Reland; the fix is to not test traditional checkpoint when the input does not require grad, as that is unsupported, as documented.

Original PR body:

Resubmission of https://github.com/pytorch/pytorch/pull/62964 with the
suggestions and tests discussed in
https://github.com/pytorch/pytorch/issues/65537.

Adds a `use_reentrant=False` flag to `checkpoint` function. When
`use_reentrant=False` is specified, a checkpointing implementation that uses
SavedVariableHooks instead of re-entrant autograd is used. This makes it more
composable with things such as `autograd.grad` as well as DDP (still need to
add thorough distributed testing).
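
A minimal usage sketch of the new flag (hedged; `block` is an illustrative function, not from this PR):

```python
import torch
from torch.utils.checkpoint import checkpoint

def block(x):
    return torch.relu(x @ x.t())

x = torch.randn(4, 4, requires_grad=True)
# Non-reentrant checkpointing backed by SavedVariableHooks
y = checkpoint(block, x, use_reentrant=False)
y.sum().backward()
print(x.grad.shape)
```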

As discussed in https://github.com/pytorch/pytorch/issues/65537, the tests that we need to add are:

- [x] Gradient hooks are called once
- [x] works when input does require grads but Tensor that require grads are captures (like first layer in a nn)
- [x] works for functions with arbitrary input/output objects
- [x] distributed tests (next PR)

Note that this is only for `torch.utils.checkpoint`, if this approach overall looks good, we will do something similar for `checkpoint_sequential`.
ghstack-source-id: 144948501

Test Plan: CI

Reviewed By: zhaojuanmao

Differential Revision: D32902634

fbshipit-source-id: 2ee87006e5045e5471ff80c36a07fbecc2bea3fe
2021-12-07 16:31:23 -08:00
Michael Suo
59e98b66ac Revert D32704467: [Autograd/Checkpoint] Checkpoint implementation without reentrant autograd
Test Plan: revert-hammer

Differential Revision:
D32704467 (e032dae329)

Original commit changeset: 6eea1cce6b93

fbshipit-source-id: 1a788c1fd57cee46bba82e216e6162d078359cc2
2021-12-06 16:33:32 -08:00
Rohan Varma
e032dae329 [Autograd/Checkpoint] Checkpoint implementation without reentrant autograd (#69027)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69027

Resubmission of https://github.com/pytorch/pytorch/pull/62964 with the
suggestions and tests discussed in
https://github.com/pytorch/pytorch/issues/65537.

Adds a `use_reentrant=False` flag to `checkpoint` function. When
`use_reentrant=False` is specified, a checkpointing implementation that uses
SavedVariableHooks instead of re-entrant autograd is used. This makes it more
composable with things such as `autograd.grad` as well as DDP (still need to
add thorough distributed testing).

As discussed in https://github.com/pytorch/pytorch/issues/65537, we have added
the following tests:

- [ ] Gradient hooks are called once
ghstack-source-id: 144644859

Test Plan: CI

Reviewed By: pbelevich

Differential Revision: D32704467

fbshipit-source-id: 6eea1cce6b935ef5a0f90b769e395120900e4412
2021-12-06 13:29:37 -08:00
soulitzer
5456d8c8f3 Add vectorized Jacobian and Hessian computation with forward AD (#67041)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67041

Original PR here: https://github.com/pytorch/pytorch/pull/62246 (The old PR does more things, but now that's split across this stack)

This PR:
- Adds "jacfwd" and "hessian_fwdrev"
- Modifies existing tests to also test the `forward_ad=True` case
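
A hedged sketch of the forward-mode paths exposed through `torch.autograd.functional` (the keyword spellings below are taken from the public docs rather than quoted from this PR):

```python
import torch
from torch.autograd.functional import jacobian, hessian

def f(x):
    return x.exp().sum(dim=0)

x = torch.randn(3, 4)
# "jacfwd"-style: forward-mode, vectorized Jacobian
J = jacobian(f, x, strategy="forward-mode", vectorize=True)
print(J.shape)  # torch.Size([4, 3, 4])

def g(x):
    return (x ** 3).sum()

# "hessian_fwdrev"-style: forward-over-reverse Hessian
H = hessian(g, torch.randn(5), outer_jacobian_strategy="forward-mode", vectorize=True)
print(H.shape)  # torch.Size([5, 5])
```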

Test Plan: Imported from OSS

Reviewed By: gchanan, zou3519

Differential Revision: D32314424

Pulled By: soulitzer

fbshipit-source-id: 785b0e39162b93dc3b3cb9413233447152eddd53
2021-11-19 14:31:09 -08:00
soulitzer
e358c49a5b Add OpInfo test and fix a couple cases (#66294)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66294

In this PR:
- OpInfo for forward AD now checks batched forward grad when `op.check_batched_grad=True`
- Adds setting to disable the test for individual ops `check_batched_forward_grad` and disable for the ops here: https://github.com/pytorch/pytorch/issues/66357

Fixes some more failures:
- Make Forward AD metadata less strict by allowing stride to differ when size is 1
- Fix sum batching rule when logical tensor is a scalar and dim is unspecified
- Batching rule for `_reshape_alias`
- ~Batching rules now preserve storage offset for view operator that return non-zero storage offset~ (moved to previous PR)

Test Plan: Imported from OSS

Reviewed By: zou3519, albanD

Differential Revision: D31842020

Pulled By: soulitzer

fbshipit-source-id: 3517a8fb9d6291fccb53c0b1631eab5bbb24ebd1
2021-11-19 14:31:03 -08:00
soulitzer
2455cc2adf Address case when layout of tangent is not same as base (#66292)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66292

In this PR:
1. Fix the case where the tangent has a different layout from the base when calling `set_fw_grad`, by adding a native function and its batching rule.

For (1) we replace the following:
```
Tensor new_with_same_meta(const Variable& base) {
  int64_t nelement_in_storage = base.storage().nbytes() / base.itemsize();
  auto new_tensor = at::zeros({nelement_in_storage}, base.options());
  auto res = new_tensor.as_strided(base.sizes(), base.strides(), base.storage_offset());
  return res;
}
```
with a native function as to enable a batching rule to alter its behavior.

This new function will be similar to `new_zeros_strided` except we also require the `storage_offset` and `storage_numel` arguments.

Possible concerns:
 - Why have redundant logic? Why not add new args to `new_zeros_strided`? This is probably a niche use case, so it's better not to complicate the current API.
 - Previously the created tensor inherits the TensorOptions of the primal. Now we inherit from the TensorOptions of the tangent.
   - Probably fine. Likely, no one relies on this because the behavior is only triggered when tangent/base have different layouts.
 - Why pass in exploded size, stride, and offset? It is possible in the non-batched case to pass in a tensor directly, but not possible when we'd like to have a batching rule. The size, stride, and offset we'd be passing won't belong to any live tensor.

Test Plan: Imported from OSS

Reviewed By: zou3519, albanD

Differential Revision: D31842019

Pulled By: soulitzer

fbshipit-source-id: a58433d814fd173bc43a2c550b395377dba40de2
2021-11-19 14:29:46 -08:00
soulitzer
913ac27112 Fixes forward AD codegen for multiple formulas (#68535)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67367

- Adds check to make sure forward grad itself does not have forward grad at the same level
- Verify with `python test/test_ops.py -k test_forward_mode_AD_linalg_eigh_cpu_float64` that it fails the check before, but passes after the codegen update

Before:
```
  if (_any_has_forward_grad_eigenvalues) {
      auto self_t_raw = toNonOptFwGrad(self);
      auto self_t = self_t_raw.defined() ? self_t_raw : at::zeros_like(toNonOptTensor(self));
      auto eigenvalues_new_fw_grad = eigh_jvp_eigenvalues(self_t, eigenvalues, eigenvectors);
      if (eigenvalues_new_fw_grad.defined()) {
        // The hardcoded 0 here will need to be updated once we support multiple levels.
        eigenvalues._set_fw_grad(eigenvalues_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
      }
  }
  if (_any_has_forward_grad_eigenvectors) {
      auto self_t_raw = toNonOptFwGrad(self);
      auto self_t = self_t_raw.defined() ? self_t_raw : at::zeros_like(toNonOptTensor(self));
      auto eigenvectors_new_fw_grad = eigh_jvp_eigenvectors(self_t, eigenvalues, eigenvectors);
      if (eigenvectors_new_fw_grad.defined()) {
        // The hardcoded 0 here will need to be updated once we support multiple levels.
        eigenvectors._set_fw_grad(eigenvectors_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
      }
  }
```

After:
```
  c10::optional<at::Tensor> eigenvalues_new_fw_grad_opt = c10::nullopt;
  if (_any_has_forward_grad_eigenvalues) {
      auto self_t_raw = toNonOptFwGrad(self);
      auto self_t = self_t_raw.defined() ? self_t_raw : at::zeros_like(toNonOptTensor(self));
      eigenvalues_new_fw_grad_opt = eigh_jvp_eigenvalues(self_t, eigenvalues, eigenvectors);
  }
  c10::optional<at::Tensor> eigenvectors_new_fw_grad_opt = c10::nullopt;
  if (_any_has_forward_grad_eigenvectors) {
      auto self_t_raw = toNonOptFwGrad(self);
      auto self_t = self_t_raw.defined() ? self_t_raw : at::zeros_like(toNonOptTensor(self));
      eigenvectors_new_fw_grad_opt = eigh_jvp_eigenvectors(self_t, eigenvalues, eigenvectors);
  }
  if (eigenvalues_new_fw_grad_opt.has_value() && eigenvalues_new_fw_grad_opt.value().defined()) {
    // The hardcoded 0 here will need to be updated once we support multiple levels.
    eigenvalues._set_fw_grad(eigenvalues_new_fw_grad_opt.value(), /* level */ 0, /* is_inplace_op */ false);
  }

  if (eigenvectors_new_fw_grad_opt.has_value() && eigenvectors_new_fw_grad_opt.value().defined()) {
    // The hardcoded 0 here will need to be updated once we support multiple levels.
    eigenvectors._set_fw_grad(eigenvectors_new_fw_grad_opt.value(), /* level */ 0, /* is_inplace_op */ false);
  }
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68535

Reviewed By: ngimel

Differential Revision: D32536089

Pulled By: soulitzer

fbshipit-source-id: a3f288540e2d78a4a9ec4bd66d2c0f0e65dd72cd
2021-11-18 17:44:17 -08:00
soulitzer
22e73f616c Update unpack_dual to return named tuple (#68062)
Summary:
Also updates the doc
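
A minimal illustration of the named-tuple return (field names as documented for `unpack_dual`):

```python
import torch
import torch.autograd.forward_ad as fwAD

with fwAD.dual_level():
    dual = fwAD.make_dual(torch.randn(3), torch.ones(3))
    res = fwAD.unpack_dual(dual)
    print(res.primal, res.tangent)  # named fields, per this PR
    primal, tangent = res           # still unpacks like a plain tuple
```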

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68062

Reviewed By: gchanan

Differential Revision: D32315089

Pulled By: soulitzer

fbshipit-source-id: 567c812da093daeb6549b0dc7ecbffd58eb8ccc2
2021-11-10 14:14:55 -08:00
Nikita Vedeneev
db456d16ee torch.lobpcg.backward: do not save non-Variable types with ctx.save_for_backward. (#67994)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67827

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67994

Reviewed By: H-Huang

Differential Revision: D32244818

Pulled By: albanD

fbshipit-source-id: 702a3a1d1f4c160bef7ec1f764a2ab5d01ca7901
2021-11-08 10:02:09 -08:00
soulitzer
823ae3a4ff [forward ad] Also check layout of grad matches that of self for inplace over view (#67816)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67800

Currently, when the grad has the same layout as the base, we try to assign the same tensor to the forward grad of both the base and the view. However, when the layout of the grad is different from the layout of the view, this triggers a copy to be created, and the tangent of the view (after the in-place op) will not have a view relationship with the view of the base.

This PR changes it so that we only apply the above optimization when the grad's layout also matches the layout of self.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67816

Reviewed By: malfet

Differential Revision: D32190021

Pulled By: soulitzer

fbshipit-source-id: b1b2c9b332e83f4df5695ee9686ea76447f9305b
2021-11-05 10:26:24 -07:00
soulitzer
83e8612d11 Clean up test autograd (#67413)
Summary:
Partially fixes https://github.com/pytorch/pytorch/issues/66066

This PR:
 - cleans up op-specific testing from test_autograd. test_autograd should be reserved for testing generic autograd functionality
 - tests related to an operator are better colocated
 - see the tracker for details

What to think about when moving tests to their correct test suite:
 - naming: make sure it's not too generic
 - how the test is parametrized, sometimes we need to add/remove a device/dtype parameter
 - can this be merged with existing tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67413

Reviewed By: jbschlosser, albanD

Differential Revision: D32031480

Pulled By: soulitzer

fbshipit-source-id: 8e13da1e58a38d5cecbfdfd4fe2b4fe6f816897f
2021-11-03 15:26:09 -07:00
kshitij12345
885a8e53ba replace onlyOnCPUAndCUDA with onlyNativeDeviceTypes (#65201)
Summary:
Reference https://github.com/pytorch/pytorch/issues/53849

Replace `onlyOnCPUAndCUDA` with `onlyNativeDeviceTypes`, which includes `cpu`, `cuda`, and `meta`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65201

Reviewed By: mrshenli

Differential Revision: D31299718

Pulled By: mruberry

fbshipit-source-id: 2d8356450c035d6a314209ab51b2c237583920fd
2021-11-01 09:22:34 -07:00
albanD
b27b1ff809 Fix deadlock when forward and backward AD are used at the same time (#67360)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67360

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D31973040

Pulled By: albanD

fbshipit-source-id: f9c75c6497b622c86e8653027bce45461304eff5
2021-10-28 09:11:36 -07:00
soulitzer
f2f7b02b4c Add support for vmap+fwdAD for basic out-of-place op (#66291)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66291

In this PR:
 - Trivial batching rules for `make_dual` and `is_same_size` that enable forward ad + vmap functionality
 - Adds a check in gradcheck that is performed when both `check_batched_grad` and `check_forward_ad` are `True` (an OpInfo using this is added later in the stack).
 - Tests for the gradcheck functionality
 - Tests that a basic out-of-place op works

Test Plan: Imported from OSS

Reviewed By: albanD, saketh-are

Differential Revision: D31842018

Pulled By: soulitzer

fbshipit-source-id: 84b18d9a77eeb19897757e37555581f2a9dc43d8
2021-10-27 08:55:06 -07:00
soulitzer
892ac08a02 Do not generate not_implemented error for forward AD when input with tangent passed to non-differentiable function (#66926)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/61926

1. update the `if` to just use `requires_derivative`, since that should reflect when the function is not differentiable
2. if `requires_derivative=True` but no outputs have forward derivatives, we should error as usual
3. ~In the future we may also want to handle the case~ when `len(fw_derivatives) > 0 and len(fw_derivatives) < num_diff_outputs`, we should add an assert in codegen that this does not happen.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66926

Reviewed By: anjali411

Differential Revision: D31810736

Pulled By: soulitzer

fbshipit-source-id: 11a14477cc7554f576cff2ed1711a448a8c6a66a
2021-10-21 13:53:07 -07:00
Jane Xu
299a6a65b2 [skip ci] Set test owners for autograd tests (#66834)
Summary:
Action following https://github.com/pytorch/pytorch/issues/66232

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66834

Reviewed By: albanD

Differential Revision: D31761778

Pulled By: janeyx99

fbshipit-source-id: 355edfb1b940154e84fbba6f7b096605e75ae459
2021-10-19 08:35:02 -07:00
lezcano
0974215c4d Prefer mT and mH over transpose(-2, -1) and transpose(-2, -1).conj() (#64181)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64181

This PR replaces all the calls to:
- `transpose(-2, -1)` or `transpose(-1, -2)` by `mT()` in C++ and `mT` in Python
- `conj().transpose(-2, -1)` or `transpose(-2, -1).conj()` or `conj().transpose(-1, -2)` or `transpose(-1, -2).conj()` by `mH()` in C++ and `mH` in Python.

It also simplifies two pieces of code, and fixes one bug where a pair
of parentheses were missing in the function `make_symmetric_matrices`.
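
A small hedged illustration of the Python-side equivalence this change relies on:
```
import torch

a = torch.randn(2, 3, 4, dtype=torch.cfloat)
# mT transposes the last two dimensions; mH additionally conjugates.
assert torch.equal(a.mT, a.transpose(-2, -1))
assert torch.equal(a.mH, a.transpose(-2, -1).conj())
```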

Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D31692896

Pulled By: anjali411

fbshipit-source-id: e9112c42343663d442dc5bd53ff2b492094b434a
2021-10-18 13:02:25 -07:00
Peter Bell
5f45927d15 Autograd: Delay warnings until the end of backward execution (#66235)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50209

This adds a new warning handler that stores all warnings in a shared
queue, which can be "replayed" at a later time and, crucially, on
another thread. Then, I use this inside the autograd engine to ensure
that warnings are processed by the handler registered on the main
thread.

For testing, I also add an operator that always warns in the backward
pass and test that the warning is a normal Python warning.
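
A hedged sketch of the user-visible effect, using a hypothetical custom Function that warns in its backward (the operator added for testing in this PR may differ):
```
import warnings
import torch

class WarnsInBackward(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        warnings.warn("warning raised during the backward pass")
        return grad_output

x = torch.randn(3, requires_grad=True)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    WarnsInBackward.apply(x).sum().backward()

# The warning emitted on the engine's worker thread is replayed on the
# calling thread as an ordinary Python warning.
print([str(w.message) for w in caught])
```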

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66235

Reviewed By: ejguan

Differential Revision: D31505413

Pulled By: albanD

fbshipit-source-id: 1a7f60b038f55c20591c0748b9e86735b3fec2f9
2021-10-13 15:38:04 -07:00
soulitzer
73901b099d Add batched_grad parameter to autograd.grad (#65564)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65564

- wrap the call into engine with vmap if `batched_grad` is `True`
- improves the comment on the call to engine (somewhat addressing https://github.com/pytorch/pytorch/issues/41659)
- borrows the message from functional.jacobian's vectorized argument concerning usage of the vmap feature
- adds basic test (further testing is done when we replace the usage in vectorized jacobian computation)

TODO:
 - create an issue tracking this

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D31236259

Pulled By: soulitzer

fbshipit-source-id: b33e6b26ea98fa9f70c44da08458fc54ba4df0f7
2021-10-03 19:55:06 -07:00
soulitzer
91611fe1d1 Decouple forward AD checks from backward AD in OpInfo tests and gradcheck (#65040)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/64999

- Adds a flag `check_backward_ad` to gradcheck that can be used to disable the backward-mode AD checks (see the sketch after this list)
  - This is a bit bc-breaking in terms of positional args, but I prefer this ordering
- In OpInfo tests for forward ad:
  - set `check_backward_ad` False
- In test_ops treat `supports_autograd` as if it is `supports_backward_ad` (it basically already is)
  - the only modification needed is to no longer skip forward ad tests if `supports_autograd` is false
  - test_dtype, test_variant_consistency, etc behave correctly as-is
  - In a follow-up PR, we can rename it to actually be `supports_backward_ad`
- Testing
  - https://github.com/pytorch/pytorch/pull/65060
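
A hedged example of the new flag (assuming the keyword arguments keep the names used in this PR):
```
import torch
from torch.autograd import gradcheck

x = torch.randn(4, dtype=torch.double, requires_grad=True)
# Only the forward-AD check runs; the backward-AD check is skipped.
assert gradcheck(torch.sin, (x,), check_forward_ad=True, check_backward_ad=False)
```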

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65040

Reviewed By: albanD

Differential Revision: D31238177

Pulled By: soulitzer

fbshipit-source-id: f068d4cbe7ffb094930b16cddb210583b9b7b2c4
2021-09-29 17:01:34 -07:00
Yukio Siraichi
c829cb6840 Port min kernel to structured kernels. (#61450)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61450

Tracking issue: #55070

Test Plan: Imported from OSS

Reviewed By: saketh-are

Differential Revision: D29741713

Pulled By: bdhirsh

fbshipit-source-id: 2c107752a90fd39cfb55e08aaf3541bd484a5fc3
2021-09-28 14:03:54 -07:00
soulitzer
4bf7959de2 Remove run_functional_checks from test_autograd and create necessary OpInfos (#64993)
Summary:
OpInfo tracker: https://github.com/pytorch/pytorch/issues/54261

 - Eliminate duplicated testing logic in test_autograd
 - Moved tests that rely on this testing logic to use OpInfos
   - `cat` already has OpInfo (no action needed)
   - Created OpInfo for `block_diag` and `broadcast_tensors`

Running into some FX errors. Added op to skip-list and created an issue here: https://github.com/pytorch/pytorch/issues/64997
Both `block_diag` and `broadcast_tensors` are variadic, so skipping `test_variant_consistency_jit` (from comments on other OpInfos, it looks like JIT does not support variadic tensors)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64993

Reviewed By: jbschlosser

Differential Revision: D30961736

Pulled By: soulitzer

fbshipit-source-id: e169305384a683acae1178c4e12e9e214a67226a
2021-09-15 12:45:38 -07:00
Victor Quach
8131bc85d0 Raise TypeError on assigned grad with wrong type (#64876)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/64813

Raises a TypeError when the value assigned to a grad is not a Tensor or
None.

Adds tests.

cc ezyang gchanan

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64876

Reviewed By: anjali411

Differential Revision: D30901678

Pulled By: soulitzer

fbshipit-source-id: dbb3cb5fd0bbac6918e0b2e2f51d340daa43dee0
2021-09-13 16:41:45 -07:00
kshitij12345
2c351c76e0 [special] Alias igamma, igammac to special.gammainc, special.gammaincc (#61902)
Summary:
Reference: https://github.com/pytorch/pytorch/issues/50345

Also added relevant OpInfo

TODO:
* [x] Check rendered docs gammainc : https://docs-preview.pytorch.org/61902/special.html#torch.special.gammainc
* [x] Check rendered docs gammaincc: https://docs-preview.pytorch.org/61902/special.html#torch.special.gammaincc
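
A small hedged sanity check of the aliases:
```
import torch

a = torch.tensor([1.0, 2.0, 3.0])
x = torch.tensor([0.5, 1.5, 2.5])
# special.gammainc / special.gammaincc alias the existing igamma / igammac ops.
assert torch.allclose(torch.special.gammainc(a, x), torch.igamma(a, x))
assert torch.allclose(torch.special.gammaincc(a, x), torch.igammac(a, x))
```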

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61902

Reviewed By: ngimel

Differential Revision: D30761428

Pulled By: mruberry

fbshipit-source-id: 06a16432873357958d53364f12a4e91c29779d26
2021-09-07 15:31:26 -07:00
Philip Meier
26b7ff5aea deprecate dtype getters from torch.testing namespace (#63554)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63554

Following https://github.com/pytorch/pytorch/pull/61840#issuecomment-884087809, this deprecates all the dtype getters publicly exposed in the `torch.testing` namespace. The reason for this is twofold:

1. If someone is not familiar with the C++ dispatch macros PyTorch uses, the names are misleading. For example, `torch.testing.floating_types()` will only give you `float32` and `float64`, skipping `float16` and `bfloat16`.
2. The dtype getters provide very minimal functionality that can be easily emulated by downstream libraries.

We thought about [providing a replacement](https://gist.github.com/pmeier/3dfd2e105842ad0de4505068a1a0270a), but ultimately decided against it. The major problem is BC: by keeping it, either the namespace gets messy again whenever a new dtype is added, or we need to somehow version the return values of the getters.

Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D30662206

Pulled By: mruberry

fbshipit-source-id: a2bdb10ab02ae665df1b5b76e8afa9af043bbf56
2021-09-07 08:58:51 -07:00
Anirudh Dagar
1a1fb31cfa Support torch.concat alias, add cat OpInfo & remove OpInfo test_out skips {cat, stack, hstack, vstack, dstack} (#62560)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/61767

## Changes

- [x] Add `torch.concat` alias to `torch.cat`
- [x] Add OpInfo for `cat`/`concat`
- [x] Fix `test_out` skips (Use `at::native::resize_output` or `at::native::resize_output_check`)
  - [x] `cat`/`concat`
  - [x] `stack`
  - [x] `hstack`
  - [x] `dstack`
  - [x] `vstack`/`row_stack`
- [x] Remove redundant tests for `cat`/`stack`

~I've not added `cat`/`concat` to OpInfo `op_db` yet, since cat is a little more tricky than other OpInfos (should have a lot of tests) and currently there are no OpInfos for that. I can try to add that in a subsequent PR or maybe here itself, whatever is suggested.~
**Edit**: cat/concat OpInfo has been added.

**Note**: I've added the named tensor support for `concat` alias as well, maybe that's out of spec in `array-api` but it is still useful for consistency in PyTorch.
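
A minimal hedged check of the alias:
```
import torch

t = torch.arange(6).reshape(2, 3)
# torch.concat is an alias for torch.cat (array-API naming).
assert torch.equal(torch.concat([t, t], dim=0), torch.cat([t, t], dim=0))
```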

Thanks to krshrimali for guidance on my first PR :))

cc mruberry rgommers pmeier asmeurer leofang AnirudhDagar asi1024 emcastillo kmaehashi heitorschueroff krshrimali

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62560

Reviewed By: saketh-are

Differential Revision: D30762069

Pulled By: mruberry

fbshipit-source-id: 6985159d1d9756238890488a0ab3ae7699d94337
2021-09-06 23:57:18 -07:00
Michael Dagitses
b737629ff0 simplify op name determination into a single forward pass (#64261)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64261

Note that this does not preserve byte-for-byte compatibility with
existing names.

Test Plan:
* Rely on CI to catch gross errors.
* Merge after release cut to catch subtle issues.

Reviewed By: albanD

Differential Revision: D30700647

Pulled By: dagitses

fbshipit-source-id: 7b02f34b8fae3041240cc78fbc6bcae498c3acd4
2021-09-02 07:32:11 -07:00
Michael Dagitses
cdb46f4c6e extract TestAutogradComplex into its own test file (#63400)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63400

This is the first step to break up test_autograd.py for #63205.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D30541499

Pulled By: dagitses

fbshipit-source-id: 8d9d32007938b9eade0e88f95a6a3190e7e2ef01
2021-09-02 04:34:35 -07:00
Alban Desmaison
e322547fe6 Add forward AD support for custom Functions (#64061)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64061

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D30640868

Pulled By: albanD

fbshipit-source-id: b0e6610430a879074d6d5306443772fc154b431f
2021-09-01 14:33:09 -07:00
Rohan Varma
421d8f86b6 Add a record scope around autograd::engine::evaluate_function (#63619)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63619

Adds a RECORD_FUNCTION scope around the function being evaluated as part
of backward execution. This has been useful in picking up some operations
in the backward pass that otherwise would not show up, for example custom C++
functions.
ghstack-source-id: 137041723

Test Plan:
CI

benchmark:
buck run mode/opt //scripts/rvarm1/ddp:bench

Reviewed By: albanD

Differential Revision: D30439492

fbshipit-source-id: 955917770cdf2a2edb0303223ace710b668ba388
2021-09-01 12:32:30 -07:00
Kushashwa Ravi Shrimali
d37636901e [Doc] make_tensor to torch.testing module (#63925)
Summary:
This PR aims to add `make_tensor` to the `torch.testing` module in PyTorch docs.

TODOs:

* [x] Add examples

cc: pmeier mruberry brianjo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63925

Reviewed By: ngimel

Differential Revision: D30633487

Pulled By: mruberry

fbshipit-source-id: 8e5a1f880c6ece5925b4039fee8122bd739538af
2021-08-30 12:25:40 -07:00
Philip Meier
57d4c6cf42 replace self.assertTrue(torch.allclose(..)) with self.assertEqual(…) (#63637)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/63565

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63637

Reviewed By: malfet

Differential Revision: D30541266

Pulled By: mruberry

fbshipit-source-id: ab461949782c6908a589ea098fcfcf5c3e081ee6
2021-08-25 16:47:40 -07:00
yanbing-j
33a163d886 Enable BFloat16 LeakyReLU and RReLU in CPU path (#61514)
Summary:
Enable and optimize BFloat16 LeakyReLU and RReLU in CPU path.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61514

Reviewed By: ejguan

Differential Revision: D30257612

Pulled By: VitalyFedyunin

fbshipit-source-id: 8cc0d1faacd02dcc9827af724a86d95b6952748f
2021-08-24 08:34:56 -07:00
Alban Desmaison
bafd875f74 Allow implementing either backward or vjp for Function (#63434)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63434

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D30431968

Pulled By: albanD

fbshipit-source-id: 0bb88664283486a9fd3364e6c3d79442a44625c2
2021-08-23 07:07:11 -07:00
Victor Quach
7bad9ac78a Fix flaky test for dp saved tensor hooks (#63324)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63324

Fix for https://www.internalfb.com/tasks/?t=98258963
`catch_warnings` seems to only trigger once in certain cases where it
should trigger twice.
This test is only meant to check whether hooks are triggered or not,
so changing it to self.assertGreater is OK.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D30340833

Pulled By: Varal7

fbshipit-source-id: 1bfb9437befe9e8ab8f95efe5f513337fa9bdc5c
2021-08-17 08:56:58 -07:00
Victor Quach
5abeac3ef7 Make saved tensors default hooks thread local (#62909)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62909

This PR makes saved tensors default hooks thread local.
This allows using default hooks in a multithreaded context.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D30165416

Pulled By: Varal7

fbshipit-source-id: 10a7d580661d3d94bdaf398c4e076b7bea11c16b
2021-08-13 07:49:20 -07:00
Victor Quach
ed7ece389d Forbid inplace modification of a saved tensor's pack_hook input (#62717)
Summary:
When using saved tensors hooks (especially default hooks),
if the user defines a `pack_hook` that modifies its input,
it can cause some surprising behavior.

The goal of this PR is to prevent future user headaches by catching
in-place modifications of the input of `pack_hook` and raising an error
when they occur.
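
A hedged sketch of the behavior being guarded against, written against the default-hook registration functions named elsewhere in this log; the exact function names, the error type, and the point at which the error is raised are assumptions:
```
import torch

def bad_pack(x):
    x.mul_(2)   # in-place modification of the tensor handed to pack_hook
    return x

def unpack(x):
    return x

torch.autograd.graph.set_saved_tensors_default_hooks(bad_pack, unpack)
try:
    a = torch.randn(3, requires_grad=True)
    y = a * a              # saves `a` for backward; the pack hook mutates it
    y.sum().backward()     # expected to raise because of the in-place change
except RuntimeError as e:  # error type is an assumption
    print("caught:", e)
finally:
    torch.autograd.graph.reset_saved_tensors_default_hooks()
```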

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62717

Reviewed By: albanD

Differential Revision: D30255243

Pulled By: Varal7

fbshipit-source-id: 8d73f1e1b50b697a59a2849b5e21cf0aa7493b76
2021-08-12 12:40:10 -07:00
Shen Li
1022443168 Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: revert-hammer

Differential Revision:
D30279364 (b004307252)

Original commit changeset: c1ed77dfe43a

fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e
2021-08-12 11:45:01 -07:00
Zsolt Dollenstein
b004307252 [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: manual inspection & sandcastle

Reviewed By: zertosh

Differential Revision: D30279364

fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a
2021-08-12 10:58:35 -07:00
Victor Quach
557047eb4c Add docstring for saved tensors default hooks (#62361)
Summary:
Add documentation for the saved tensors default hooks introduced in https://github.com/pytorch/pytorch/issues/61834 / https://github.com/pytorch/pytorch/issues/62563

Sister PR: https://github.com/pytorch/pytorch/issues/62362 (will add a link from autograd.rst to notes/autograd in whatever PR does not land first)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62361

Reviewed By: zou3519

Differential Revision: D30081997

Pulled By: Varal7

fbshipit-source-id: cb923e943e1d96db9669c1d863d693af30910c62
2021-08-10 14:59:38 -07:00
kshitij12345
f836c4f8bd [fix] TestMultiThreadAutograd: propagate exception from child thread to main thread (#63018)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/62895

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63018

Reviewed By: anjali411

Differential Revision: D30225856

Pulled By: Varal7

fbshipit-source-id: b5dd7999de5060e06f8958ea3ce49e0b74110971
2021-08-10 13:56:49 -07:00
Ilia Cherniavskii
773a8eede4 [profiler][refactor] Refactor the usage of legacy profiler implementation (#61931)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61931

This PR consolidates the profiling code around a new C++ implementation
(profiler_kineto.h/cpp) and uses it unconditionally from
torch.autograd.profiler/torch.profiler:
1. Always use profiler_kineto.h/cpp as the C++ implementation
2. Simplify profiler.py to remove unneeded parts depending on legacy
impl
3. Move some of the legacy logic into profiler_legacy.py (to be fully
deleted later)

Test Plan:
USE_KINETO=1 USE_CUDA=1 USE_MKLDNN=1 BLAS=MKL BUILD_BINARY=1 python setup.py develop install --cmake
python test/test_profiler.py -v
USE_KINETO=0 USE_CUDA=1 USE_MKLDNN=1 BLAS=MKL BUILD_BINARY=1 python setup.py develop install --cmake
python test/test_profiler.py -v

Imported from OSS

Reviewed By: gdankel

Differential Revision: D29801599

fbshipit-source-id: 9794d29f2af38dddbcd90dbce4481fc8575fa29e
2021-08-03 18:51:29 -07:00
Victor Quach
9beb279d84 Add context manager to save tensors on CPU (#61928)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61928

Fix #57100.
Creates a function `torch.autograd.graph.set_save_on_cpu_hooks()` which can be used to register default hooks under which all tensors saved during the forward pass are actually copied* to CPU, then copied back to the appropriate device for the backward pass.

*If the tensor was already on CPU, the entire operation is a no-op.

If the tensor is on GPU, we copy it to pinned memory (`pin_memory`) during packing so that the unpacking can be done asynchronously.

See [benchmark](https://github.com/pytorch/pytorch/pull/61928#issuecomment-885089279) and [note about training large models](https://github.com/pytorch/pytorch/pull/61928#issuecomment-887009448)
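
A hedged sketch using the registration function as it is named in this commit; current releases expose this functionality as the `torch.autograd.graph.save_on_cpu` context manager instead, so the exact spelling here is an assumption:
```
import torch

# Register the default hooks that offload saved tensors to (pinned) CPU memory.
torch.autograd.graph.set_save_on_cpu_hooks()

a = torch.randn(5, device="cuda", requires_grad=True)
y = (a * a).sum()   # the tensor saved for backward lives on the CPU
y.backward()        # it is copied back to the CUDA device for the backward pass

# Assumption: the generic reset undoes the save-on-CPU hooks.
torch.autograd.graph.reset_saved_tensors_default_hooks()
```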

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D29848526

Pulled By: Varal7

fbshipit-source-id: 3d289cddd4fa377bd4884ba0d569fa47c777d9e5
2021-08-03 13:08:37 -07:00
Victor Quach
b161ac541d [reland] Add default Saved Variable hooks (#62563)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62563

Expose a pair of functions to Python users: torch.autograd.graph.set_saved_tensors_default_hooks(pack, unpack) and torch.autograd.graph.reset_saved_tensors_default_hooks().
These functions control the hooks applied to saved tensors: all tensors saved in that context will be packed using the pack function, then unpacked accordingly when needed.

Currently, this works by simply calling register_hooks (cf #60975) directly at the end of the constructor of a SavedVariable. This could be optimized further by not performing the copy before registering default hooks, but this would require a small refactor. Edit: the refactor is done in #61927.

A current limitation is that if users create tensors in this context, they will not be able to register additional hooks on the saved tensor.

For instance, to perform something like #28997, one could define a pack function that saves to disk whenever the tensor size is too big and returns a filename, then unpack simply reads the content of the file and outputs a tensor, e.g.:

```
def pack(x):
    name = os.path.join(tmp_dir, str(uuid.uuid4()))
    torch.save(x, name)
    return name

def unpack(name):
    return torch.load(name)
```

Relanding previous PR: https://github.com/pytorch/pytorch/pull/61834

Original PR led to timeout error in: https://www.internalfb.com/mast/job/yuguo-release_canary_offline_training-inlinecvrp_a-canary_offline_train_28a7ecfc

Now passing: https://www.internalfb.com/mast/job/quach-release_canary_offline_training-inlinecvrp_a-canary_offline_train_9bb57e98

The difference with the new version is we don't need to acquire the GIL when calling `PyDefaultSavedVariableHooks::get_hooks`.

Test Plan: Imported from OSS

Reviewed By: iramazanli

Differential Revision: D30045405

Pulled By: Varal7

fbshipit-source-id: 7f6c07af3a56fe8835d5edcc815c15ea4fb4e332
2021-08-02 11:30:26 -07:00
kshitij12345
cb626da145 [fix] mark non-differentiable ops (#62529)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/62506
Fixes https://github.com/pytorch/pytorch/issues/62504

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62529

Reviewed By: albanD

Differential Revision: D30032665

Pulled By: malfet

fbshipit-source-id: 90254c50fb4a873e3eda59c8484626137e01cb31
2021-08-02 09:40:45 -07:00
Yu Guo
5c47038d12 Back out D29792193 "Add default Saved Variable hooks" (#62415)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62415

test error

Differential Revision: D29990361

fbshipit-source-id: 99c87dec6c5be6496c9db5c9205c3cb72a953dd9
2021-07-29 16:31:00 -07:00
albanD
4c3eea26bd Fix out= variant forward grad detection (#60499)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60499

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D29914595

Pulled By: albanD

fbshipit-source-id: c51bb3aed91ab1f6ebc57936143b249590a43bd5
2021-07-27 13:06:45 -07:00
Victor Quach
be17d6eadf Add default Saved Variable hooks (#61834)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61834

Expose a pair of functions to Python users: torch.autograd.graph.set_saved_tensors_default_hooks(pack, unpack) and torch.autograd.graph.reset_saved_tensors_default_hooks().
These functions control the hooks applied to saved tensors: all tensors saved in that context will be packed using the pack function, then unpacked accordingly when needed.

Currently, this works by simply calling register_hooks (cf #60975) directly at the end of the constructor of a SavedVariable. This could be optimized further by not performing the copy before registering default hooks, but this would require a small refactor. Edit: the refactor is done in #61927.

A current limitation is that if users create tensors in this context, they will not be able to register additional hooks on the saved tensor.

For instance, to perform something like #28997, one could define a pack function that saves to disk whenever the tensor size is too big and returns a filename, then unpack simply reads the content of the file and outputs a tensor, e.g.:

```
def pack(x):
    name = os.path.join(tmp_dir, str(uuid.uuid4()))
    torch.save(x, name)
    return name

def unpack(name):
    return torch.load(name)
```

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D29792193

Pulled By: Varal7

fbshipit-source-id: 33e931230ef59faa3ec8b5d11ef7c05539bce77c
2021-07-26 08:14:32 -07:00
Philip Meier
10ccc5a81c remove randn? from torch.testing namespace (#61840)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61840

Redo of #60859.

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D29871017

Pulled By: mruberry

fbshipit-source-id: 47afed1dc6aa0bb1e826af616ef5d5aaabb8e5bb
2021-07-23 11:51:03 -07:00
Nikita Shulga
604f503d30 Revert D29794958 + compilation fix (#61937)
Summary:
This PR un-reverts https://github.com/pytorch/pytorch/issues/61475 and fixes compilation with MSVC, which does not recognize alternative operator spellings (i.e. using `or` instead of `||`).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61937

Reviewed By: albanD

Differential Revision: D29805941

Pulled By: malfet

fbshipit-source-id: 01e5963c6717c1b44b260300d87ba0bf57f26ce9
2021-07-20 18:14:45 -07:00
Nikita Shulga
22fff61f06 Revert D29794958: [pytorch][PR] changing trapz to trapezoid
Test Plan: revert-hammer

Differential Revision:
D29794958 (95cec8f4fa)

Original commit changeset: 60b9c07efd47

fbshipit-source-id: 2dcda2d62e01c2521a86ae5ed8246cfb686d3f64
2021-07-20 16:00:46 -07:00
Kevin Tse
95cec8f4fa changing trapz to trapezoid (#61475)
Summary:
This PR resolves issue https://github.com/pytorch/pytorch/issues/52606 while also adding support for complex numbers.

Stack from [ghstack](https://github.com/ezyang/ghstack):
* https://github.com/pytorch/pytorch/issues/61616
* https://github.com/pytorch/pytorch/issues/61615
* **https://github.com/pytorch/pytorch/issues/61475**

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61475

Reviewed By: mruberry

Differential Revision: D29794958

Pulled By: NivekT

fbshipit-source-id: 60b9c07efd47fd85b9c8178768fc7828d7b57d29
2021-07-20 15:25:55 -07:00
Victor Quach
ff82394fc0 Apply saved tensor hooks (#60975)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60975

Fixes #58512

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D29466227

fbshipit-source-id: c1498d52173aceb29638b5c4f521ac05356a5958
2021-07-18 08:42:51 -07:00
Victor Quach
ee5a97de11 Register Saved Tensors hooks (#60663)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60663

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D29466223

fbshipit-source-id: 65dc3a935c18a0e6b93a37e24543c696e6ae0321
2021-07-15 08:09:55 -07:00
Peter Bell
429436edbd Avoid complex-to-real cast warning in CopyBackward (#60021)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60021

Dropping the imaginary component is expected and gives the correct gradient
formula, so silencing the warning is appropriate.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D29589371

Pulled By: mruberry

fbshipit-source-id: 73e1511cae69207dc9abe576e2769ee1d03f1bbd
2021-07-07 15:28:38 -07:00
Victor Quach
5b44d817fb Expose raw saved tensors for codegen functions (#60565)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60565

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D29466225

fbshipit-source-id: 77eb4214a1baecc501282413d99d55f8935dc01f
2021-07-01 11:25:21 -07:00
Victor Quach
a5e2ea4345 Add noop register hook (#60685)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60685

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D29466224

fbshipit-source-id: 68c8aa022ccffeefd45062f1443d15c9a6824f3d
2021-06-30 07:46:34 -07:00
Victor Quach
f54290fd72 Expose raw saved tensors for custom functions (#60551)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60551

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D29466228

fbshipit-source-id: 7565f6cc3f2488c7e444cf81c7eb37a60c75b0e8
2021-06-29 17:21:52 -07:00
Xiong Wei
7e3a694b23 supports non-leaf inputs for autograd.backward() function (#60521)
Summary:
Close https://github.com/pytorch/pytorch/issues/60268

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60521

Reviewed By: ngimel

Differential Revision: D29393586

Pulled By: albanD

fbshipit-source-id: 2dd2de427ecfecca8d544237bacf690e0b7c918c
2021-06-25 18:57:26 -07:00
Jeffrey Wan
b34965435d Improve testing of inplace views (#59891)
Summary:
Partially addresses https://github.com/pytorch/pytorch/issues/49825 by improving the testing
 - Rename some of the old tests that had "inplace_view" in their names, but actually mean "inplace_[update_]on_view" so there is no confusion with the naming
 - Adds some tests in test_view_ops that verify basic behavior
 - Add tests that creation meta is properly handled for no-grad, multi-output, and custom function cases
 - Add a test that verifies that in the cross-dtype view case, the in-place views won't be accounted for in the backward graph on rebase, as mentioned in the issue.
 - Update inference mode tests to also check in-place

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59891

Reviewed By: albanD

Differential Revision: D29272546

Pulled By: soulitzer

fbshipit-source-id: b12acf5f0e3f788167ebe268423cdb58481b56f6
2021-06-22 12:28:09 -07:00
Michael Dagitses
91451369ed require non-empty inputs to grad() calls in the API (#52016)
Summary:
The grad() function needs to return the updated values, and hence
needs a non-empty `inputs` argument to populate.
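
A hedged sketch of the check from the Python side (the exact error type and message are assumptions):
```
import torch

x = torch.randn(3, requires_grad=True)
y = (x * x).sum()

(gx,) = torch.autograd.grad(y, inputs=(x,), retain_graph=True)   # fine: non-empty inputs

try:
    torch.autograd.grad(y, inputs=())   # now rejected instead of silently returning nothing
except RuntimeError as e:               # error type is an assumption
    print("empty inputs rejected:", e)
```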

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52016

Test Plan:
Passes Python and C++ unit tests, and added new tests to catch this behavior.

Fixes https://github.com/pytorch/pytorch/issues/47061

Reviewed By: albanD

Differential Revision: D26406444

Pulled By: dagitses

fbshipit-source-id: 023aeca9a40cd765c5bad6a1a2f8767a33b75a1a
2021-06-22 10:10:58 -07:00
albanD
8a839c5478 Fix saved variable unpacking version counter (#60195)
Summary:
We only set the value and not the actual version counter (VC).
This means that in the context of double backward, if that saved tensor is saved again and the original Tensor is modified in place, we would not detect it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60195

Reviewed By: Varal7

Differential Revision: D29208766

Pulled By: albanD

fbshipit-source-id: 81175f8e3f111f89524f8e46f47577b2ea4fc945
2021-06-18 04:36:46 -07:00
Victor Quach
1efa863837 Avoid un-necessary unwrapping of Tensor in SavedVariable (#59837)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59837

Fixes #58500

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D29069215

fbshipit-source-id: 603db3c8a64b729e86385ed774825f01c6ce0f20
2021-06-16 16:43:04 -07:00
Michael Carilli
be038d8989 [CUDA graphs] Make stream semantics of backward calls consistent with other cuda ops (ci-all edition) (#57833)
Summary:
ci-all resubmit of https://github.com/pytorch/pytorch/pull/54227.

Tests look good except for a few distributed autograd failures (pytorch_linux_xenial_cuda10_2_cudnn7_py3_multigpu_test) and rocm failures (pr/pytorch-linux-bionic-rocm4.1-py3.6).

The common denominator in rocm failures appears to be multi-gpu activity: some [multiprocess DDP failures](https://ci.pytorch.org/jenkins/job/pytorch-builds/job/pytorch-linux-bionic-rocm4.1-py3.6-test1/8115/console), some [single-process failures](https://ci.pytorch.org/jenkins/job/pytorch-builds/job/pytorch-linux-bionic-rocm4.1-py3.6-test2/8115/console) where the single process has autograd ops that span devices. jeffdaily jithunnair-amd sunway513, could one of you take a look? The streaming backward change is also beneficial to rocm, I expect.

For debugging rocm failures, I think we should ignore the multiprocess/DDP tests and focus on the single process cases. The root cause is probably the same and the single process cases are simpler.

----------------------------------

Update: Rocm failures are due to https://github.com/pytorch/pytorch/issues/59750.
2718a54032 is a workaround, to be updated once https://github.com/pytorch/pytorch/issues/59750 is fixed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57833

Reviewed By: mruberry

Differential Revision: D28942391

Pulled By: ngimel

fbshipit-source-id: d6047e971c5f1c6386334bf3641402a92f12e2f8
2021-06-13 12:09:56 -07:00
albanD
e6110d4d5d Fix input_buffer check if inplace update is valid (#59817)
Summary:
Fixes an issue introduced in  https://github.com/pytorch/pytorch/issues/17182

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59817

Reviewed By: bdhirsh

Differential Revision: D29040738

Pulled By: albanD

fbshipit-source-id: 67fd4e9fa0dadf507ddd954d20e119d8781c4de0
2021-06-11 07:29:03 -07:00
Victor Quach
0fa3db5594 Fix subgradient for element-wise max and min (#59669)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59669

Fixes #56734

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D28975531

fbshipit-source-id: 4e774dc8c6e095bc66962ce2411466de3880c2d3
2021-06-09 15:21:45 -07:00
Jeffrey Wan
1733d10399 Warn when backward() is called with create_graph=True (#59412)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/4661
- Adds a warning in the engine's `execute` function so it can be triggered through both the C++ and Python codepaths (see the sketch after this list)
- Adds an RAII guard version of `c10::Warning::set_warnAlways` and replaces all prior usages of `set_warnAlways` with the new one
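
A hedged sketch of the Python-side effect:
```
import warnings
import torch

x = torch.randn(3, requires_grad=True)
y = (x * x).sum()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    y.backward(create_graph=True)   # now emits a warning about potential reference cycles

print([str(w.message) for w in caught])
```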

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59412

Reviewed By: jbschlosser

Differential Revision: D28969294

Pulled By: soulitzer

fbshipit-source-id: b03369c926a3be18ce1cf363b39edd82a14245f0
2021-06-08 17:19:04 -07:00
Victor Quach
5fc105b323 Raise NotImplementedError on forward passes (#59483)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59483

... for functions that are not implemented

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D28933806

fbshipit-source-id: dadae1af6609f15419cf0f47a98361dc87dff849
2021-06-08 14:03:19 -07:00
Victor Quach
c268eefe96 Use TORCH_CHECK_NOT_IMPLEMENTED for AD not implemented (#59482)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59482

Fixes #53398

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D28933809

fbshipit-source-id: 53387ec9690fc235b0622b50800feced706ea1ee
2021-06-08 14:02:04 -07:00
Mike Ruberry
de40c8e495 Adds remaining OpInfos and removes redundant test generators (#55558)
Summary:
Per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55558

Reviewed By: ngimel

Differential Revision: D28922522

Pulled By: mruberry

fbshipit-source-id: 89cefd93788bc8aa0683f4583cf5caa81aa2dc93
2021-06-06 14:52:26 -07:00
anjali411
3607478ecd Conjugate View (#54987)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54987

Based off of ezyang (https://github.com/pytorch/pytorch/pull/44799) and bdhirsh (https://github.com/pytorch/pytorch/pull/43702) 's prototype:

Here's a summary of the changes in this PR:
This PR adds a new dispatch key called Conjugate. This enables us to make the conjugate operation a view and leverage the specialized library functions that fast path with the hermitian operation (conj + transpose).

1. Conjugate operation will now return a view with conj bit (1) for complex tensors and returns self for non-complex tensors as before. This also means `torch.view_as_real` will no longer be a view on conjugated complex tensors and is hence disabled. To fill the gap, we have added `torch.view_as_real_physical` which would return the real tensor agnostic of the conjugate bit on the input complex tensor. The information about conjugation on the old tensor can be obtained by calling `.is_conj()` on the new tensor.
2. NEW API:
    a) `.conj()` -- now returning a view.
    b) `.conj_physical()` -- does the physical conjugate operation. If the conj bit for input was set, you'd get `self.clone()`, else you'll get a new tensor with conjugated value in its memory.
    c) `.conj_physical_()`, and `out=` variant
    d) `.resolve_conj()`  -- materializes the conjugation. returns self if the conj bit is unset, else returns a new tensor with conjugated values and conj bit set to 0.
    e) `.resolve_conj_()` in-place version of (d)
    f) `view_as_real_physical` -- as described in (1), it's functionally same as `view_as_real`, just that it doesn't error out on conjugated tensors.
    g) `view_as_real` -- existing function, but now errors out on conjugated tensors.
3. Conjugate Fallback
    a) Vast majority of PyTorch functions would currently use this fallback when they are called on a conjugated tensor.
    b) This fallback is well equipped to handle the following cases:
        - functional operation e.g., `torch.sin(input)`
        - Mutable inputs and in-place operations e.g., `tensor.add_(2)`
        - out-of-place operation e.g., `torch.sin(input, out=out)`
        - Tensorlist input args
        - NOTE: Meta tensors don't work with conjugate fallback.
4. Autograd
    a) `resolve_conj()` is an identity function w.r.t. autograd
    b) Everything else works as expected.
5. Testing:
    a) All method_tests run with conjugate view tensors.
    b) OpInfo tests that run with conjugate views
        - test_variant_consistency_eager/jit
        - gradcheck, gradgradcheck
        - test_conj_views (that only run for `torch.cfloat` dtype)

NOTE: functions like `empty_like`, `zeros_like`, `randn_like`, `clone` don't propagate the conjugate bit.

Follow up work:
1. conjugate view RFC
2. Add neg bit to re-enable view operation on conjugated tensors
3. Update linalg functions to call into specialized functions that fast path with the hermitian operation.
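
A hedged illustration of the new conjugate-view API described in (2) above:
```
import torch

z = torch.tensor([1 + 2j, 3 - 4j])
v = z.conj()                         # a lazy view: only the conj bit is set
print(v.is_conj())                   # True
print(z.conj_physical())             # eagerly materialized conjugate values
print(v.resolve_conj().is_conj())    # False: resolve_conj() clears the bit
```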

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D28227315

Pulled By: anjali411

fbshipit-source-id: acab9402b9d6a970c6d512809b627a290c8def5f
2021-06-04 14:12:41 -07:00
Jeffrey Wan
4ae5764d47 Add is_inference to native functions (#58729)
Summary:
Adds `is_inference` as a native function w/ manual cpp bindings.
Also changes instances of `is_inference_tensor` to `is_inference` to be consistent with other properties such as `is_complex`.
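
A small hedged example of the renamed property:
```
import torch

with torch.inference_mode():
    t = torch.ones(3)

print(t.is_inference())              # True
print(torch.ones(3).is_inference())  # False, like other properties such as is_complex()
```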

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58729

Reviewed By: mruberry

Differential Revision: D28874507

Pulled By: soulitzer

fbshipit-source-id: 0fa6bcdc72a4ae444705e2e0f3c416c1b28dadc7
2021-06-04 08:59:11 -07:00
albanD
d095ec75a1 Forward AD formulas batch 2 (#57863)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57863

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D28387763

Pulled By: albanD

fbshipit-source-id: e1b60ab728bb05b9e3323ee0dc7e401aaf5b8817
2021-06-03 07:33:04 -07:00
albanD
e9e5588588 Improve Tensor traverse to traverse its grad_fn when possible (#58271)
Summary:
There are two main changes here:
- THPVariable will now actually visit its grad_fn if there is no other reference to the C++ Tensor and no other reference to the grad_fn. The critical observation compared to the existing comment (thanks Ed!) is that if we also check that the C++ Tensor object is not referenced anywhere else, we're sure that no one can change the grad_fn refcount between the traverse and the clear.
- THPVariable doesn't need a special clear for these new cases: as we're the only owner of the C++ Tensor, the cdata.reset() will necessarily free the Tensor and all its resources.

The two tests are to ensure:
- That the cycles are indeed collectible by the gc

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58271

Reviewed By: ngimel

Differential Revision: D28796461

Pulled By: albanD

fbshipit-source-id: 62c05930ddd0c48422c79b03118db41a73c1355d
2021-06-01 10:27:52 -07:00