The rationale for this is that functorch doesn't currently work with saved
variable hooks or checkpointing, so we need some way to disable them.
Concretely:
- there's a context manager that does the disabling
- this feature is disabled on a thread-local basis
- one can set an error message or use the default error message that
says the feature has been disabled
Since it is thread-local, I needed to update ATen/ThreadLocalState. To
make things nicer, this PR refactors all the "saved tensors hooks"
related TLS state into a single struct.
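A minimal usage sketch, assuming the context manager is exposed as `torch.autograd.graph.disable_saved_tensors_hooks` (the exact name and location are an assumption based on this description):
```python
import torch

# Assumed API: a thread-local context manager that disables the saved tensors
# hooks feature and reports the given error message when the feature is used.
with torch.autograd.graph.disable_saved_tensors_hooks(
        "saved tensors hooks are disabled in this region"):
    try:
        # Registering hooks while disabled should raise with the message above.
        with torch.autograd.graph.saved_tensors_hooks(lambda x: x, lambda x: x):
            pass
    except RuntimeError as e:
        print(e)
```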
Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85553
Approved by: https://github.com/soulitzer
Addresses: https://github.com/pytorch/pytorch/issues/83617
This PR adds a way to query the TLS graph task's exec_info, which is a map from Node to a bool indicating whether that node will be executed in the current backward pass (as determined by the inputs= argument to .grad or .backward).
- this works with both custom Function nodes and normal codegened nodes
- to be able to verify whether the pyobject passed is an actual node, we now store pointers to PyTypeObjects into a set on registration.
- error out when .backward was called without inputs=, to avoid silently returning True
Alternatives:
- not sure if it is possible to bind to Python from a raw pointer to Node. At least we wouldn't be able to use existing logic, and the Python object should only hold a weak reference to the Node.
- other solutions to the motivating issue seem to require more extensive modification to the engine
See the issue linked for an example of usage
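A rough usage sketch of the query from inside a backward hook; the binding name `torch._C._will_engine_execute_node` is assumed here and should be treated as hypothetical:
```python
import torch

a = torch.randn(2, 2, requires_grad=True)
b = torch.randn(2, 2, requires_grad=True)
c = a * b
out = c.sum()

def hook(grad):
    # Queried during backward: c.grad_fn (MulBackward0) must run to reach `a`,
    # so this should print True given inputs=[a] below. Querying the
    # AccumulateGrad node for `b` would print False instead.
    print(torch._C._will_engine_execute_node(c.grad_fn))  # assumed binding name

out.register_hook(hook)
out.backward(inputs=[a])  # inputs= is required, otherwise the query errors out
```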
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84773
Approved by: https://github.com/albanD
Add unit tests and docstrings corresponding to PR https://github.com/pytorch/pytorch/pull/63289
UT:
1. `test_profiler_emit_itt` in `test/test_autograd.py`. This test is merely intended to catch if emit_itt breaks on construction.
2. Test `torch.profiler.itt` functions in `test/test_itt.py`
3. Only testing that emit_itt runs when `record_shapes` option is enabled in `test/test_profiler.py`.
Docstring:
1. add ITT-related info to `docs/source/bottleneck.rst`
2. add `torch.profiler.itt` functions to `docs/source/profiler.rst`
3. add docstrings to the `torch.profiler.itt` functions in `torch/profiler/itt.py`
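A minimal sketch of the helpers being tested and documented, assuming `range_push`/`range_pop`/`mark` are the exposed `torch.profiler.itt` functions and that the build has ITT support:
```python
import torch

# emit_itt mirrors the other emit_* profiler context managers; record_shapes
# is the option exercised by the test mentioned above.
with torch.autograd.profiler.emit_itt(record_shapes=True):
    torch.profiler.itt.range_push("my_region")
    torch.profiler.itt.mark("inside my_region")
    torch.ones(8).sum()
    torch.profiler.itt.range_pop()
```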
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84848
Approved by: https://github.com/malfet
Fix use-dict-literal pylint suggestions by changing `dict()` to `{}`. This PR should do the change for every Python file except test/jit/test_list_dict.py, where I think the intent is to test the constructor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83718
Approved by: https://github.com/albanD
Make it so that it is valid to set metadata after detach calls, like `x.detach().resize_(...)`.
This technically lifts some restrictions around `.data`: you can now call `x.data.resize_(...)`, which directly resizes `x` instead of erroring.
My understanding: Before the tensor-variable merge, when `x` and `x.data` were really different tensors, you could resize `x.data` independently of `x`, and during the merge, this error was added to avoid silent confusing behavior changes.
It was agreed that this error has been around long enough (several years) that it's acceptable to drop. cc @albanD @ezyang.
(Ed already had a prototype PR [here](https://github.com/pytorch/pytorch/pull/83545) - I ended up making one to try to slog through test failures).
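A minimal sketch of the pattern this PR allows (both calls previously raised an error):
```python
import torch

x = torch.zeros(2, 3, requires_grad=True)
x.detach().resize_(3, 2)  # setting metadata after a detach() call is now allowed
x.data.resize_(6)         # per this PR, this now resizes x instead of erroring
```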
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83590
Approved by: https://github.com/ezyang
Per offline discussion, this will be updated to use expand once expand semantics for nested tensors have been fleshed out.
Next steps are to add support for the other forward sum features mentioned in #82387 and to update the backward accordingly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82625
Approved by: https://github.com/albanD
`derivatives.yaml` can now take a `dispatch` entry which registers per-autograd dispatch key derivatives such as
```
- name: foo(Tensor self, Tensor y) -> Tensor
  dispatch:
    Default:
      self: grad
      y: grad.expand(y.sizes())
    AutogradNestedTensor:
      self: grad
      y: NestedTensor_foo_backward(grad, y)
  output_differentiability: [True]
```
However, the old schema without a `dispatch` entry is still supported.
Would greatly appreciate feedback on *how to improve the testing strategy* of this PR. Currently I have registered an aten test op in TestOps.cpp with dummy gradients in derivatives.yaml, and added some tests in test_autograd.py:TestAutogradMultipleDispatch, but I am not sure whether these are sufficiently rigorous.
Additionally, this PR also makes the assumption that sets like [VIEW_FUNCTIONS](ff5399e528/tools/autograd/gen_inplace_or_view_type.py (L60)) are per-native-function and not per-native-function-and-dispatch-key. I'm not sure whether this is necessarily the case: *would there ever be a situation where, e.g., a nested_tensor op is a view op but the aten function is not, or vice versa?*
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82801
Approved by: https://github.com/bhosmer, https://github.com/albanD
### Description
cudaProfilerStart and cudaProfilerStop are deprecated but exposed by torch.cuda.cudart(). HIP has corresponding functions stubbed out, hipProfilerStart and hipProfilerStop, but they return hipErrorNotSupported. Profiling in HIP is supported, but not via these deprecated APIs.
See https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__PROFILER__DEPRECATED.html.
These functions are indirectly used by one or more unit tests that would otherwise pass if the non-functional HIP APIs were replaced with a dummy function.
### Testing
Unskipped a related unit test, run by ciflow/trunk.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82778
Approved by: https://github.com/ezyang
Towards fixing https://github.com/pytorch/pytorch/issues/82482
This PR fixes two things:
## 1) memory leak
The .detach() call prevents a true memory leak in some cases where the user function uses multiple ops in a row that save their inputs. The following chain of objects keeps each other alive:
- the `storage` object
- a recomputed Tensor y
- y's grad_fn FooBackward (in c++)
- FooBackward's SavedVariables (in c++)
- SavedVariable Hook
- the `inner_pack` function, which captures `storage` (closing the cycle)
Since part of this cycle is in c++, the python gc is not able to break it.
Should THPCppFunction_traverse actually visit its SavedVariables, which in turn should visit their hooks? I think the answer is yes, but I haven't dived into which python object is traversing what: if there is non-unique ownership of the c++ object, it makes the traversal a lot trickier. @ezyang do you think we should dive into this more?
In this case, this can be easily solved anyway by storing `y.detach()` in the `storage` object, as we don't care about the temporary backward graph that gets created during the second forward call.
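A simplified illustration of that fix; the names here are illustrative, not the actual `torch.utils.checkpoint` internals:
```python
# Store a detached copy in the recomputation storage so the temporary backward
# graph built during the second forward (and its SavedVariable hooks capturing
# `storage`) cannot form an uncollectible Python/C++ reference cycle.
storage = {}

def inner_pack(y):
    handle = id(y)
    storage[handle] = y.detach()  # drop y's grad_fn; only the values are needed
    return handle

def inner_unpack(handle):
    return storage[handle]
```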
## 2) Lifetime of the recomputed buffers
The new storage system is now such that the lifetime of the recomputed buffer is directly linked to the SavedVariable c++ object, meaning that this buffer will get deleted iff the SavedVariable is cleared.
This means that we now get the exact same behavior as the version without the saved variable hook where Tensors are saved directly on the SavedVariable object.
This is great as this solves all the cases where the non-checkpoint version used to work but the checkpoint version does not (even double access or retain_graph=True).
The one drawback of this approach, though, is that the buffers do NOT get cleared when the user passes in `retain_graph=True`! The next backward won't even re-run the forward as it already has all the buffers available. Is this a problem that you think we would need to find a solution for, @rohan-varma, or is it niche enough that we don't care for now?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82776
Approved by: https://github.com/ezyang, https://github.com/rohan-varma
I don't think there's a way to avoid functions returning undefined tensors as outputs, so codegen will have to detect them before calling _set_fw_grad. Alternatively, we can just make calling _set_fw_grad with undefined self a no-op, but I'm leaning toward keeping _set_fw_grad stricter in case it is called in other areas.
Fixes https://github.com/pytorch/pytorch/issues/81111
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81114
Approved by: https://github.com/albanD
See this doc: https://docs.google.com/document/d/1KiRdnoj6B4cI3yl017hTbCqcOGO1gWIpUf20sldipHM/edit#
Two issues are fixed: (1) regarding hooks in general and (2) regarding retains_grad hooks. Python hooks, which rely on a different mechanism, are not discussed here:
- Hooks in cpp in general
- (fixed) new hooks registered to a newer version of the tensor no longer get applied to the grad_fn
associated with the older version of the tensor from when the first hook was ever registered
- (unchanged) hooks registered to the older version of the tensor remain active on that older version's grad_fn
- Retains grad hooks
- (fixed) retains_grad hooks now get moved to the latest grad_fn. NB: to the user, retains_grad is not considered a hook
or expected to behave like one; hooks are considered properties of the grad_fn, whereas retains_grad-ness
is a property of the tensor.
- (not in this PR) Python hooks
- (will fix) same issue as hooks in cpp where new hooks are being applied to grad_fn associated
with the older version of the tensor
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79996
Approved by: https://github.com/albanD
https://github.com/pytorch/pytorch/pull/63619 added a RECORD_FUNCTION guard to make calls to `Engine::evaluate_function` visible regardless of the underlying op. While useful, this creates a call that looks like a forward call and somewhat complicates stitching forward and backward ops together. I don't want to add complexity (and therefore work) on the hot path; instead it's fairly straightforward to stitch things back together in post. This PR simply propagates the sequence number and forward tid info up to the `evaluate_function` event.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77696
Differential Revision: [D36302562](https://our.internmc.facebook.com/intern/diff/D36302562/)
Approved by: https://github.com/aaronenyeshi
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76253
We're observing a large QPS regression on the original PR https://github.com/pytorch/pytorch/pull/72302. For the training job we had, it regressed from 720k QPS to 450k QPS (see the test plan in FB internal). We suspect this is because the api was changed from `_record_function_enter` to `_record_function_enter_new`, and we're running experiments to confirm that. Will add more details when the runs in the test plan have finished. For now, it's better to revert the diff to unblock internal use cases, and we can think about how to reland this diff later.
Original commit changeset: dc9939f1fa6d
Original Phabricator Diff: D35257354
Test Plan:
on trunk: f338665947
with this diff: f338502850
Reviewed By: malfet, robieta
Differential Revision: D35853300
fbshipit-source-id: dd38042aeacb848f66756491a4c849c7c652a0e1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71708
In Python 3.2, a number of asserts were deprecated.
In Python 3.11, these asserts are deleted completely. The files in this change still use the deprecated asserts.
Switch over to the supported syntax for 3.2 onwards.
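An example of the kind of substitution involved, assuming the deprecated asserts in question are the old unittest aliases such as `assertEquals`:
```python
import unittest

class ExampleTest(unittest.TestCase):
    def test_sum(self):
        # Before: self.assertEquals(1 + 1, 2)  # deprecated alias
        self.assertEqual(1 + 1, 2)             # supported spelling since 3.2

if __name__ == "__main__":
    unittest.main()
```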
Test Plan: Tested on the internal test suite runner.
Reviewed By: ajtulloch
Differential Revision: D33503694
fbshipit-source-id: a150f296033260acf8365d77b837ce0679f57361
(cherry picked from commit abf60ed97409265222915d8265aaabedd625fd93)
Summary:
Description of the new behavior is in PythonFallbackKernel.cpp.
The updated test makes sure that we only call alias on the first Tensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73925
Reviewed By: samdow
Differential Revision: D34862940
Pulled By: albanD
fbshipit-source-id: 4d020e41c8bb8b10262dcafd524e84a5ad4d7af0
(cherry picked from commit 0aa6b56dbd3dcee830453fb02cd6c83ab7a8be06)
Summary:
Minimal example that deadlocks before but not after:
```python
import torch
from torch.autograd import Function

class Foo(Function):
    @staticmethod
    def forward(ctx, x):
        return x.clone()

    @staticmethod
    def backward(ctx, gO):
        return gO.clone()

def get_out():
    inp = torch.rand(2, requires_grad=True)

    # The python function is first so that it runs
    # last in the backward pass
    right = Foo.apply(inp)

    # An op that creates new memory
    left1 = inp.clone()
    # An op that saves its input
    left2 = left1 ** 2

    # Inplace modify so that the backward for
    # left2 always raises an error
    left1 += 1

    # An op that takes both sides as input.
    # After running, both sides' last op will be in
    # the ready queue.
    # And the op for left will run first as it was
    # executed last during the forward.
    out = left2 + right

    return out

# Nothing should be global variables here as, from what
# I can see, python leaks all the global objects
get_out().sum().backward()
```
Since this requires the python interpreter to die, it is hard to test in CI.
Let me know if you have an idea how to do it though.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73961
Reviewed By: malfet
Differential Revision: D34752747
Pulled By: albanD
fbshipit-source-id: 1a537b1f733e161e8d3ff053cd432b37b34d432a
(cherry picked from commit 17943e4c04c782d81deab439e010195f04e75bbd)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72301
First step in resolving #35026.
This adds `PythonRecordFunction`, which is a `torch::CustomClassHolder`
for `at::RecordFunction`, to keep the ATen code free of torch includes.
It also adds a new, currently unused internal API function,
`_record_function_enter_new`, which returns the torchbind object.
Once the FC period is expired, `torch.profiler.record_function` will
be updated to use this new internal API. Then once BC period is
expired, the cpp_custom_type_hack-based API can be removed.
Test Plan: Imported from OSS
Reviewed By: dagitses
Differential Revision: D34586311
Pulled By: robieta
fbshipit-source-id: d3eb9ffad7b348548a2b22c75203a92d1cb5115b
(cherry picked from commit 92d2ca808e5fbd20c9d6645dcabc3f059f9ef2d3)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72688
Refactor how we know what to run on the CPU queue.
Lazy tensors were moved there: the Lazy device is always present as a device guard and would otherwise make the number of devices 1 all the time (forcing the creation of a worker thread).
FYI wconstab you most likely don't care about this unless you ever use multiple Lazy devices?
This should slightly improve the perf if you run backward with Lazy Tensors, as the work will be done in the main thread and not a worker thread.
Test Plan: Imported from OSS
Reviewed By: soulitzer
Differential Revision: D34180245
Pulled By: albanD
fbshipit-source-id: 88c5d5bdd631ad01bf271d720d1eab69aba84fc0
(cherry picked from commit da7e9b902f)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72008
Fixes #71119
Technically BC-breaking because when an input does not require grad, previously it was returned as-is instead of a view because it didn't need to. Now we will also return a view in that case (whether or not forward AD runs).
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D33859553
Pulled By: soulitzer
fbshipit-source-id: 81b3fa371f4c0904630878500aa190492c562367
(cherry picked from commit ee74bc8234)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71707
Why?
- detach should behave like jax.stop_gradient in functorch. Because it
does not detach all the way through, functorch (as well as a Tensor
Subclass wrapping a Tensor subclass) won't see it after the first
layer/subclass handles it.
How?
- This PR changes detach to dispatch all the way through to the backend.
- This PR also modifies native::detach to call shallow_copy_and_detach
instead of native::alias. This is because today, the semantics of detach
and alias are different -- they differ only by
allow_tensor_metadata_change. In the future, we may choose to deprecate
this flag.
- NB: Before and after this PR, detach() shows up twice in
torch_dispatch: https://github.com/pytorch/pytorch/issues/71725. This is
not a regression so I didn't want to fix it in this PR because it is
weird to fix.
Test Plan: - added new tests; run existing tests
Reviewed By: albanD
Differential Revision: D33752860
Pulled By: zou3519
fbshipit-source-id: 40cc2dc8232e75a02586a4ba5b0ef5f16cb76617
(cherry picked from commit f88aae426e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69827
In general, the current pattern allows for implementing optimisations
for all the backends in a common place (see for example the optimisation
for empty matrices).
After this PR, `torch.svd` is implemented in terms of `linalg.svd` and
`linalg.svdvals`, as expected. This makes it differentiable in the case
when `compute_uv=False`, although this is not particularly important, as
`torch.svd` will eventually be deprecated.
This PR also instantiates smaller `U` / `V` when calling cusolver_gesvdj
in the cases when `full_matrices=False` or `compute_uv=False`.
The memory for the auxiliary `U` and `V` needed by some cuSOLVER routines in
the cases above is allocated via raw allocators rather than through fully
fledged tensors, as it's just a blob of memory the algorithm requests.
As the code is better structured now, it was easier to see that `U` and
`Vh` needn't be allocated when calling `svd_cusolver_gesvd`.
Now `linalg.svdvals` works as expected wrt the `out=` parameter.
Note that in the test `test_svd_memory_allocation` we were
passing a tensor of the wrong size and dtype and the test seemed to
pass...
This PR also changes the backward formula to avoid saving the input
matrix, as it's not necessary. In a follow up PR, I will clean the
backward formula and make it more numerically stable and efficient.
This PR also does a number of memory optimisations here and there, and fixes
the call to cusolver_gesvd, which was incorrect for m <= n. To test
this path, I compiled the code with a flag to unconditionally execute
the `if (!gesvdj_convergence_check.empty())` branch, and all the tests
passed.
I also took this chance to simplify the tests for these functions in
`test_linalg.py`, as we had lots of tests that were testing some
functionality that is already currently tested in the corresponding
OpInfos. I used xwang233's feature to test both MAGMA and CUDA
backends. This is particularly good for SVD, as cuSOLVER is always
chosen over MAGMA when available, so testing MAGMA otherwise would be
tricky.
cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano
Test Plan: Imported from OSS
Reviewed By: mikaylagawarecki
Differential Revision: D33751983
Pulled By: mruberry
fbshipit-source-id: 11d48d977946345583d33d14fb11a170a7d14fd2
(cherry picked from commit a1860bd567)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71569
Not sure if this is the right API
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D33695395
Pulled By: soulitzer
fbshipit-source-id: 652b5758f15d901f98ff0da94e977030c7f3415b
(cherry picked from commit 9421a6846a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71531
Based on the comment above the original internal assert, this is the desired check. Options considered:
1. Don't error, and automatically make jvp return a view for that tensor output (this is easier than I originally thought: https://github.com/pytorch/pytorch/pull/71531#discussion_r789211877)
2. Error (currently doing)
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D33695399
Pulled By: soulitzer
fbshipit-source-id: dba49890a55ad1dd59ed5c41faa96bf7cfc9e562
(cherry picked from commit fdb0f266f5)
Summary:
When default hooks are set, they are pushed onto a stack.
When nesting context managers, only the innermost hooks will
be applied.
There is special care needed to update the TLS code. See also https://github.com/pytorch/pytorch/issues/70940 (i.e. do we need to be storing the enabled flag as well?)
Fixes https://github.com/pytorch/pytorch/issues/70134
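A small sketch of the stacking behavior, assuming the hooks are exposed through the `torch.autograd.graph.saved_tensors_hooks` context manager:
```python
import torch

def make_hooks(tag):
    def pack(x):
        print(f"pack from {tag}")
        return x
    def unpack(x):
        return x
    return pack, unpack

a = torch.ones(2, requires_grad=True)
with torch.autograd.graph.saved_tensors_hooks(*make_hooks("outer")):
    with torch.autograd.graph.saved_tensors_hooks(*make_hooks("inner")):
        y = a * a  # tensors saved here are packed by the inner hooks
    z = a * a      # after the inner context exits, the outer hooks apply again
```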
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70932
Reviewed By: mruberry
Differential Revision: D33530370
Pulled By: albanD
fbshipit-source-id: 3197d585d77563f36c175d3949115a0776b309f4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68631
This PR:
- Adds the check that the storage numel of the base and tangent tensors are the same. This is to support the case when as_strided reveals elements that aren't indexable by the input tensor.
- Skips the check when batched tensors are involved, because using as_strided to reveal elements that are not indexable by the input tensor is already not allowed in vmap.
- Adds tests for the above two cases, as well as an edge case regarding conj bit (what about neg bit?)
For functorch:
- we need to copy the batching rule implemented here
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D32899678
Pulled By: soulitzer
fbshipit-source-id: 54db9550dd2c93bc66b8fb2d36ce40799ebba794
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69592
Currently, the forward AD function for `copy_` (in `VariableTypeManual`) does not handle the broadcasting case. ~EDIT: but that is a design decision, not a bug. In this PR, we make that clear as a comment.~
Note: `broadcast_to` does not have a batching rule in core, so the ops that rely on `copy_` to broadcast will still fail batched forward grad computation.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D33020603
Pulled By: soulitzer
fbshipit-source-id: 09cb702bffc74061964a9c05cfef5121f8164814
Summary:
This fixes the case when `torch.inference_mode` is called with `mode=False` (disabled). When used as a decorator, it ignored the argument and enabled inference mode anyway.
`_DecoratorContextManager` is changed so that a new instance is a copy instead of a new instance with default parameters.
I also added more tests to cover this case.
Current behaviour:
```python
>>> import torch
>>> x = torch.ones(1, 2, 3, requires_grad=True)
>>> @torch.inference_mode(mode=False)
... def func(x):
... return x * x
...
>>> out = func(x)
>>> out.requires_grad
False
```
New behaviour (fixed):
```python
>>> import torch
>>> x = torch.ones(1, 2, 3, requires_grad=True)
>>> @torch.inference_mode(mode=False)
... def func(x):
... return x * x
...
>>> out = func(x)
>>> out.requires_grad
True
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68617
Reviewed By: mrshenli
Differential Revision: D32958434
Pulled By: albanD
fbshipit-source-id: 133c69970ef8bffb9fc9ab5142dedcffc4c32945
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68630
Constraints:
1) (functorch) if all the inputs to an op have requires_grad=False and don't have tangents, then their VariableType
kernel should be a no-op, i.e., behave like a redispatch. This is due to functorch's DynamicLayerStack
having the autograd key by default (which is so that transformations like vmap still work with autograd)
2) (inference mode) inference tensors in inference mode will call straight into the kernel, we should still do something sensible
inside even if we normally wouldn't redispatch into it.
3) ~Should support potential application of interposition below autograd: `nn.Parameter` is a example of subclassing where the subclass
is not preserved when an operation is performed. There is an exception though: we want calling `make_dual` on a
`nn.Parameter` to preserve its parameterness.~
4) Should avoid calls to shallow_copy_and_detach to avoid spurious calls into `__python_dispatch__`.
This PR:
- does not redispatch to `make_dual` from its `ADInplaceOrView` kernel to satisfy (1)
- calls into `alias` from the kernel in the native namespace so that behavior is consistent with other views in inference mode to satisfy (2)
- discussion of (3). We still wouldn't be able to directly override `make_dual` below autograd. In this PR, instead of not redispatching at all, we choose to redispatch into `at::alias` so that one can override `make_dual`. The side effect is that one would not be able to distinguish calls between the two, which can be problematic (though a straightforward but hacky solution would be to create a new `at::alias_for_make_dual` that would allow users to distinguish the two). This isn't ideal but seems to be the simplest way to satisfy (3). We don't pursue that hacky solution here.
- (4) is satisfied because we remove calls to `shallow_copy_and_detach`
<details>
<summary> A potentially less hacky but more involved solution? (WIP) </summary>
Realizing that make_dual is more like requires_grad, perhaps it shouldn't be autograd explicit? Make make_dual a composite or python-only construct. i.e., it would be a view on the primal followed by something to the effect of primal.set_fw_grad(tangent).
Additional constraints:
5) make_dual needs to be backward-differentiable (I can't think of any applications yet because
technically as a high-order function, jvp's input is the tangent only, "detach" is not applied on
the tangent, so one would still be able to propagate gradients through it).
6) set_fw_grad needs to raise an error if there is a layout mismatch and base is a forward-differentiable view
Possible plan
- (6) implies that a plain view would not suffice. We need a `detach`-like operation to ensure that set_fw_grad
knows the view is not forward differentiable.
- (5) implies that this (new) `detach` would need to be backward differentiable (API TBD).
- (3) is no longer relevant because make_dual is no longer autograd explicit, but perhaps this new detach should behave like the current one? There is a lot of logic to replicate for detach, so this may be hard.
- (1) is satisfied if we use the current detach logic, and (4) is trivial.
I'm not convinced that this is the right solution either, because in the end does (3) still work?
</details>
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D32899679
Pulled By: soulitzer
fbshipit-source-id: 98e13ae954e14e1e68dbd03eb5ab3300d5ed2c5e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69508
Original Phabricator Diff: D32704467 (e032dae329)
Reland; the fix is to not test traditional checkpoint when the input does not require grad, as that is unsupported (as documented).
Original PR body:
Resubmission of https://github.com/pytorch/pytorch/pull/62964 with the
suggestions and tests discussed in
https://github.com/pytorch/pytorch/issues/65537.
Adds a `use_reentrant=False` flag to the `checkpoint` function. When
`use_reentrant=False` is specified, a checkpointing implementation that uses
SavedVariableHooks instead of re-entrant autograd is used. This makes it more
composable with things such as `autograd.grad` as well as DDP (still need to
add thorough distributed testing).
As discussed in https://github.com/pytorch/pytorch/issues/65537, the tests that we need to add are:
- [x] Gradient hooks are called once
- [x] works when the input does not require grad but Tensors that require grad are captured (like the first layer in a nn)
- [x] works for functions with arbitrary input/output objects
- [x] distributed tests (next PR)
Note that this is only for `torch.utils.checkpoint`, if this approach overall looks good, we will do something similar for `checkpoint_sequential`.
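A minimal sketch of the flag, assuming the current `torch.utils.checkpoint.checkpoint` signature:
```python
import torch
from torch.utils.checkpoint import checkpoint

lin = torch.nn.Linear(4, 4)
x = torch.randn(2, 4, requires_grad=True)

# use_reentrant=False selects the SavedVariableHooks-based implementation,
# which composes with torch.autograd.grad.
out = checkpoint(lambda inp: lin(inp).relu(), x, use_reentrant=False)
(grad_x,) = torch.autograd.grad(out.sum(), x)
```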
ghstack-source-id: 144948501
Test Plan: CI
Reviewed By: zhaojuanmao
Differential Revision: D32902634
fbshipit-source-id: 2ee87006e5045e5471ff80c36a07fbecc2bea3fe
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69027
Resubmission of https://github.com/pytorch/pytorch/pull/62964 with the
suggestions and tests discussed in
https://github.com/pytorch/pytorch/issues/65537.
Adds a `use_reentrant=False` flag to the `checkpoint` function. When
`use_reentrant=False` is specified, a checkpointing implementation that uses
SavedVariableHooks instead of re-entrant autograd is used. This makes it more
composable with things such as `autograd.grad` as well as DDP (still need to
add thorough distributed testing).
As discussed in https://github.com/pytorch/pytorch/issues/65537, we have added
the following tests:
- [ ] Gradient hooks are called once
ghstack-source-id: 144644859
Test Plan: CI
Reviewed By: pbelevich
Differential Revision: D32704467
fbshipit-source-id: 6eea1cce6b935ef5a0f90b769e395120900e4412
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67041
Original PR here: https://github.com/pytorch/pytorch/pull/62246 (The old PR does more things, but now that's split across this stack)
This PR:
- Adds "jacfwd" and "hessian_fwdrev"
- Modifies existing tests to also test the `forward_ad=True` case
Test Plan: Imported from OSS
Reviewed By: gchanan, zou3519
Differential Revision: D32314424
Pulled By: soulitzer
fbshipit-source-id: 785b0e39162b93dc3b3cb9413233447152eddd53
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66294
In this PR:
- OpInfo for forward AD now checks batched forward grad when `op.check_batched_grad=True`
- Adds a per-op setting, `check_batched_forward_grad`, to disable the test for individual ops, and disables it for the ops listed here: https://github.com/pytorch/pytorch/issues/66357
Fixes some more failures:
- Make Forward AD metadata less strict by allowing stride to differ when size is 1
- Fix sum batching rule when logical tensor is a scalar and dim is unspecified
- Batching rule for `_reshape_alias`
- ~Batching rules now preserve storage offset for view operator that return non-zero storage offset~ (moved to previous PR)
Test Plan: Imported from OSS
Reviewed By: zou3519, albanD
Differential Revision: D31842020
Pulled By: soulitzer
fbshipit-source-id: 3517a8fb9d6291fccb53c0b1631eab5bbb24ebd1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66292
In this PR:
1. Fix the case when the tangent has a different layout from the base when `set_fw_grad` is called, by adding a native function and its batching rule.
For (1) we replace the following:
```
Tensor new_with_same_meta(const Variable& base) {
int64_t nelement_in_storage = base.storage().nbytes() / base.itemsize();
auto new_tensor = at::zeros({nelement_in_storage}, base.options());
auto res = new_tensor.as_strided(base.sizes(), base.strides(), base.storage_offset());
return res;
}
```
with a native function as to enable a batching rule to alter its behavior.
This new function will be similar to `new_zeros_strided` except we also require the `storage_offset` and `storage_numel` arguments.
Possible concerns:
- Why have redundant logic? Why not add the new args to `new_zeros_strided`? This is probably a niche use case, so it's better not to complicate the current API.
- Previously the created tensor inherits the TensorOptions of the primal. Now we inherit from the TensorOptions of the tangent.
- Probably fine. Likely, no one relies on this because the behavior is only triggered when tangent/base have different layouts.
- Why pass in exploded size, stride, and offset? It is possible in the non-batched case to pass in a tensor directly, but not possible when we'd like to have a batching rule. The size, stride, and offset we'd be passing won't belong to any live tensor.
Test Plan: Imported from OSS
Reviewed By: zou3519, albanD
Differential Revision: D31842019
Pulled By: soulitzer
fbshipit-source-id: a58433d814fd173bc43a2c550b395377dba40de2
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67367
- Adds check to make sure forward grad itself does not have forward grad at the same level
- Verify with `python test/test_ops.py -k test_forward_mode_AD_linalg_eigh_cpu_float64` that it fails the check before, but passes after the codegen update
Before:
```
if (_any_has_forward_grad_eigenvalues) {
auto self_t_raw = toNonOptFwGrad(self);
auto self_t = self_t_raw.defined() ? self_t_raw : at::zeros_like(toNonOptTensor(self));
auto eigenvalues_new_fw_grad = eigh_jvp_eigenvalues(self_t, eigenvalues, eigenvectors);
if (eigenvalues_new_fw_grad.defined()) {
// The hardcoded 0 here will need to be updated once we support multiple levels.
eigenvalues._set_fw_grad(eigenvalues_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
}
}
if (_any_has_forward_grad_eigenvectors) {
auto self_t_raw = toNonOptFwGrad(self);
auto self_t = self_t_raw.defined() ? self_t_raw : at::zeros_like(toNonOptTensor(self));
auto eigenvectors_new_fw_grad = eigh_jvp_eigenvectors(self_t, eigenvalues, eigenvectors);
if (eigenvectors_new_fw_grad.defined()) {
// The hardcoded 0 here will need to be updated once we support multiple levels.
eigenvectors._set_fw_grad(eigenvectors_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
}
}
```
After:
```
c10::optional<at::Tensor> eigenvalues_new_fw_grad_opt = c10::nullopt;
if (_any_has_forward_grad_eigenvalues) {
auto self_t_raw = toNonOptFwGrad(self);
auto self_t = self_t_raw.defined() ? self_t_raw : at::zeros_like(toNonOptTensor(self));
eigenvalues_new_fw_grad_opt = eigh_jvp_eigenvalues(self_t, eigenvalues, eigenvectors);
}
c10::optional<at::Tensor> eigenvectors_new_fw_grad_opt = c10::nullopt;
if (_any_has_forward_grad_eigenvectors) {
auto self_t_raw = toNonOptFwGrad(self);
auto self_t = self_t_raw.defined() ? self_t_raw : at::zeros_like(toNonOptTensor(self));
eigenvectors_new_fw_grad_opt = eigh_jvp_eigenvectors(self_t, eigenvalues, eigenvectors);
}
if (eigenvalues_new_fw_grad_opt.has_value() && eigenvalues_new_fw_grad_opt.value().defined()) {
// The hardcoded 0 here will need to be updated once we support multiple levels.
eigenvalues._set_fw_grad(eigenvalues_new_fw_grad_opt.value(), /* level */ 0, /* is_inplace_op */ false);
}
if (eigenvectors_new_fw_grad_opt.has_value() && eigenvectors_new_fw_grad_opt.value().defined()) {
// The hardcoded 0 here will need to be updated once we support multiple levels.
eigenvectors._set_fw_grad(eigenvectors_new_fw_grad_opt.value(), /* level */ 0, /* is_inplace_op */ false);
}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68535
Reviewed By: ngimel
Differential Revision: D32536089
Pulled By: soulitzer
fbshipit-source-id: a3f288540e2d78a4a9ec4bd66d2c0f0e65dd72cd
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67800
Currently when the grad is the same layout as base, we try to assign the same tensor to the forward grad of both the base and the view. However, when the layout of the grad is different from the layout of the view, this triggers a copy to be created, and the tangent of the view (after the inplace) will not have a view relationship with the view of the base.
This PR just changes it so that we only do the above optimization when the layout also matches the layout of self
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67816
Reviewed By: malfet
Differential Revision: D32190021
Pulled By: soulitzer
fbshipit-source-id: b1b2c9b332e83f4df5695ee9686ea76447f9305b
Summary:
Partially fixes https://github.com/pytorch/pytorch/issues/66066
This PR:
- cleans up op-specific testing from test_autograd. test_autograd should be reserved for testing generic autograd functionality
- tests related to an operator are better colocated
- see the tracker for details
What to think about when moving tests to their correct test suite:
- naming: make sure it's not too generic
- how the test is parametrized, sometimes we need to add/remove a device/dtype parameter
- can this be merged with existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67413
Reviewed By: jbschlosser, albanD
Differential Revision: D32031480
Pulled By: soulitzer
fbshipit-source-id: 8e13da1e58a38d5cecbfdfd4fe2b4fe6f816897f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66291
In this PR:
- Trivial batching rules for `make_dual` and `is_same_size` that enable forward ad + vmap functionality
- Adds a check in gradcheck that is performed when both `check_batched_grad` and `check_forward_ad` are `True` (an OpInfo using this is added later in the stack).
- Tests for the gradcheck functionality
- Tests that basic out-of-place op works
Test Plan: Imported from OSS
Reviewed By: albanD, saketh-are
Differential Revision: D31842018
Pulled By: soulitzer
fbshipit-source-id: 84b18d9a77eeb19897757e37555581f2a9dc43d8
Summary:
Fixes https://github.com/pytorch/pytorch/issues/61926
1. update the `if` to just use requires_derivative since that should reflect when function is not differentiable
2. if `requires_derivative=True` but no outputs have forward derivatives, we should error as usual
3. ~In the future we may also want to handle the case~ when `len(fw_derivatives) > 0 and len(fw_derivatives) < num_diff_outputs`, we should add an assert in codegen that this does not happen.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66926
Reviewed By: anjali411
Differential Revision: D31810736
Pulled By: soulitzer
fbshipit-source-id: 11a14477cc7554f576cff2ed1711a448a8c6a66a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64181
This PR replaces all the calls to:
- `transpose(-2, -1)` or `transpose(-1, -2)` by `mT()` in C++ and `mT` in Python
- `conj().transpose(-2, -1)` or `transpose(-2, -1).conj()` or `conj().transpose(-1, -2)` or `transpose(-1, -2).conj()` by `mH()` in C++ and `mH` in Python.
It also simplifies two pieces of code, and fixes one bug where a pair
of parentheses was missing in the function `make_symmetric_matrices`.
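A short sketch of the replacements described above (`.mT`/`.mH` as the matrix-transpose and conjugate-transpose shorthands):
```python
import torch

a = torch.randn(2, 3, dtype=torch.cfloat)
assert torch.equal(a.mT, a.transpose(-2, -1))
assert torch.equal(a.mH, a.transpose(-2, -1).conj())
```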
Test Plan: Imported from OSS
Reviewed By: H-Huang
Differential Revision: D31692896
Pulled By: anjali411
fbshipit-source-id: e9112c42343663d442dc5bd53ff2b492094b434a
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50209
This adds a new warning handler that stores all warnings in a shared
queue, which can be "replayed" at a later time and, crucially, on
another thread. Then, I use this inside the autograd engine to ensure
that warnings are processed by the handler registered on the main
thread.
For testing, I also add an operator that always warns in the backward
pass and test that the warning is a normal Python warning.
cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66235
Reviewed By: ejguan
Differential Revision: D31505413
Pulled By: albanD
fbshipit-source-id: 1a7f60b038f55c20591c0748b9e86735b3fec2f9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65564
- wraps the call into the engine with vmap if `batched_grad` is `True`
- improves the comment on the call to the engine (somewhat addressing https://github.com/pytorch/pytorch/issues/41659)
- borrows the message from functional.jacobian's vectorized argument concerning usage of the vmap feature
- adds basic test (further testing is done when we replace the usage in vectorized jacobian computation)
TODO:
- create an issue tracking this
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D31236259
Pulled By: soulitzer
fbshipit-source-id: b33e6b26ea98fa9f70c44da08458fc54ba4df0f7
Summary:
Fixes https://github.com/pytorch/pytorch/issues/64999
- Adds a flag to gradcheck, `check_backward_ad`, that can be used to disable gradcheck for backward AD (usage sketched below)
- This is a bit bc-breaking in terms of positional args, but I prefer this ordering
- In OpInfo tests for forward ad:
- set `check_backward_ad` False
- In test_ops treat `supports_autograd` as if it is `supports_backward_ad` (it basically already is)
- the only modification needed is to no longer skip forward ad tests if `supports_autograd` is false
- test_dtype, test_variant_consistency, etc behave correctly as-is
- In a follow-up PR, we can rename it to actually be `supports_backward_ad`
- Testing
- https://github.com/pytorch/pytorch/pull/65060
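A minimal usage sketch of the new flag, assuming the current `torch.autograd.gradcheck` signature:
```python
import torch
from torch.autograd import gradcheck

x = torch.randn(3, dtype=torch.double, requires_grad=True)
# Check only forward-mode AD; backward-mode checking is disabled.
gradcheck(torch.sin, (x,), check_forward_ad=True, check_backward_ad=False)
```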
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65040
Reviewed By: albanD
Differential Revision: D31238177
Pulled By: soulitzer
fbshipit-source-id: f068d4cbe7ffb094930b16cddb210583b9b7b2c4
Summary:
OpInfo tracker: https://github.com/pytorch/pytorch/issues/54261
- Eliminate duplicated testing logic in test_autograd
- Moved tests that rely on this testing logic to use OpInfos
- `cat` already has OpInfo (no action needed)
- Created OpInfo for `block_diag` and `broadcast_tensors`
Running into some FX errors. Added op to skip-list and created an issue here: https://github.com/pytorch/pytorch/issues/64997
Both `block_diag` and `broadcast_tensors` are variadic, so skipping `test_variant_consistency_jit` (from comments on other OpInfos, it looks like JIT does not support variadic tensors)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64993
Reviewed By: jbschlosser
Differential Revision: D30961736
Pulled By: soulitzer
fbshipit-source-id: e169305384a683acae1178c4e12e9e214a67226a
Summary:
Fixes https://github.com/pytorch/pytorch/issues/64813
Raises a TypeError when the value assigned to a grad is not a Tensor or
None.
Adds tests.
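A short sketch of the behavior described:
```python
import torch

x = torch.ones(2, requires_grad=True)
x.grad = torch.zeros(2)  # OK: a Tensor
x.grad = None            # OK: None
try:
    x.grad = 123         # not a Tensor or None
except TypeError as e:
    print("raised:", e)
```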
cc ezyang gchanan
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64876
Reviewed By: anjali411
Differential Revision: D30901678
Pulled By: soulitzer
fbshipit-source-id: dbb3cb5fd0bbac6918e0b2e2f51d340daa43dee0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63554
Following https://github.com/pytorch/pytorch/pull/61840#issuecomment-884087809, this deprecates all the dtype getters publicly exposed in the `torch.testing` namespace. The reason for this is twofold:
1. If someone is not familiar with the C++ dispatch macros PyTorch uses, the names are misleading. For example `torch.testing.floating_types()` will only give you `float32` and `float64` skipping `float16` and `bfloat16`.
2. The dtype getters provide very minimal functionality that can be easily emulated by downstream libraries.
We thought about [providing an replacement](https://gist.github.com/pmeier/3dfd2e105842ad0de4505068a1a0270a), but ultimately decided against it. The major problem is BC: by keeping it, either the namespace is getting messy again after a new dtype is added or we need to somehow version the return values of the getters.
Test Plan: Imported from OSS
Reviewed By: H-Huang
Differential Revision: D30662206
Pulled By: mruberry
fbshipit-source-id: a2bdb10ab02ae665df1b5b76e8afa9af043bbf56
Summary:
Fixes https://github.com/pytorch/pytorch/issues/61767
## Changes
- [x] Add `torch.concat` alias to `torch.cat`
- [x] Add OpInfo for `cat`/`concat`
- [x] Fix `test_out` skips (Use `at::native::resize_output` or `at::native::resize_output_check`)
- [x] `cat`/`concat`
- [x] `stack`
- [x] `hstack`
- [x] `dstack`
- [x] `vstack`/`row_stack`
- [x] Remove redundant tests for `cat`/`stack`
~I've not added `cat`/`concat` to OpInfo `op_db` yet, since cat is a little more tricky than other OpInfos (should have a lot of tests) and currently there are no OpInfos for that. I can try to add that in a subsequent PR or maybe here itself, whatever is suggested.~
**Edit**: cat/concat OpInfo has been added.
**Note**: I've added the named tensor support for `concat` alias as well, maybe that's out of spec in `array-api` but it is still useful for consistency in PyTorch.
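A tiny sketch of the alias:
```python
import torch

a, b = torch.ones(2), torch.zeros(2)
assert torch.equal(torch.concat([a, b]), torch.cat([a, b]))  # concat aliases cat
```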
Thanks to krshrimali for guidance on my first PR :))
cc mruberry rgommers pmeier asmeurer leofang AnirudhDagar asi1024 emcastillo kmaehashi heitorschueroff krshrimali
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62560
Reviewed By: saketh-are
Differential Revision: D30762069
Pulled By: mruberry
fbshipit-source-id: 6985159d1d9756238890488a0ab3ae7699d94337
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64261
Note that this does not preserve byte-for-byte compatibility with
existing names.
Test Plan:
* Rely on CI to catch gross errors.
* Merge after release cut to catch subtle issues.
Reviewed By: albanD
Differential Revision: D30700647
Pulled By: dagitses
fbshipit-source-id: 7b02f34b8fae3041240cc78fbc6bcae498c3acd4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63400
This is the first step to break up test_autograd.py for #63205.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D30541499
Pulled By: dagitses
fbshipit-source-id: 8d9d32007938b9eade0e88f95a6a3190e7e2ef01
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63619
Adds a RECORD_FUNCTION with the function that is being evaluated as part
of backwards execution. This has been useful in picking up some operations
in the backwards pass that otherwise would not show up, for example custom cpp
functions that use custom C++ code.
ghstack-source-id: 137041723
Test Plan:
CI
benchmark:
buck run mode/opt //scripts/rvarm1/ddp:bench
Reviewed By: albanD
Differential Revision: D30439492
fbshipit-source-id: 955917770cdf2a2edb0303223ace710b668ba388
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63324
Fix for https://www.internalfb.com/tasks/?t=98258963
`catch_warnings` seems to only trigger once in certain cases where it
should trigger twice.
This test is only meant to check whether hooks are triggered or not,
so changing it to self.assertGreater is ok.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D30340833
Pulled By: Varal7
fbshipit-source-id: 1bfb9437befe9e8ab8f95efe5f513337fa9bdc5c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62909
This PR makes saved tensors default hooks thread local.
This allows using default hooks in a multithreaded context.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D30165416
Pulled By: Varal7
fbshipit-source-id: 10a7d580661d3d94bdaf398c4e076b7bea11c16b
Summary:
When using saved tensors hooks (especially default hooks),
if the user defines a `pack_hook` that modifies its input,
it can cause some surprising behavior.
The goal of this PR is to prevent future user headache by catching
inplace modifications of the input of `pack_hook` and raising an error if
applicable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62717
Reviewed By: albanD
Differential Revision: D30255243
Pulled By: Varal7
fbshipit-source-id: 8d73f1e1b50b697a59a2849b5e21cf0aa7493b76
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61931
This PR consolidates the profiling code around a new C++ implementation
(profiler_kineto.h/cpp) and uses it unconditionally from
torch.autograd.profiler/torch.profiler:
1. Always use profiler_kineto.h/cpp as the C++ implementation
2. Simplify profiler.py to remove unneeded parts depending on legacy
impl
3. Move some of the legacy logic into profiler_legacy.py (to be fully
deleted later)
Test Plan:
USE_KINETO=1 USE_CUDA=1 USE_MKLDNN=1 BLAS=MKL BUILD_BINARY=1 python setup.py develop install --cmake
python test/test_profiler.py -v
USE_KINETO=0 USE_CUDA=1 USE_MKLDNN=1 BLAS=MKL BUILD_BINARY=1 python setup.py develop install --cmake
python test/test_profiler.py -v
Imported from OSS
Reviewed By: gdankel
Differential Revision: D29801599
fbshipit-source-id: 9794d29f2af38dddbcd90dbce4481fc8575fa29e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61928
Fixes #57100.
Creates a function `torch.autograd.graph.set_save_on_cpu_hooks()` which can be used to register default hooks under which all tensors saved during the forward pass are actually copied* to cpu, then copied back to the appropriate device for the backward pass.
*If the tensor was already on cpu, the entire operation is a no op.
If the tensor is on GPU, we copy the tensor to `pin_memory` during packing so that the unpacking can be done asynchronously.
See [benchmark](https://github.com/pytorch/pytorch/pull/61928#issuecomment-885089279) and [note about training large models](https://github.com/pytorch/pytorch/pull/61928#issuecomment-887009448)
Test Plan: Imported from OSS
Reviewed By: soulitzer
Differential Revision: D29848526
Pulled By: Varal7
fbshipit-source-id: 3d289cddd4fa377bd4884ba0d569fa47c777d9e5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62563
Expose a pair of functions to Python users: torch.autograd.graph.set_saved_tensors_default_hooks(pack, unpack) and torch.autograd.graph.reset_saved_tensors_default_hooks().
These functions control the hooks applied to saved tensors: all tensors saved in that context will be packed using the pack function, then unpacked accordingly when needed.
Currently, this works by simply calling register_hooks (cf #60975) directly at the end of the constructor of a SavedVariable. This could be optimized further by not performing the copy before registering default hooks, but this would require a small refactor. Edit: the refactor is done in #61927.
A current limitation is that if users create tensors in this context, they will not be able to register additional hooks on the saved tensor.
For instance, to perform something like #28997, one could define a pack function that saves to disk whenever the tensor size is too big and returns a filename, then unpack simply reads the content of the file and outputs a tensor, e.g.:
```
import os, tempfile, uuid
import torch

tmp_dir = tempfile.mkdtemp()

def pack(x):
    name = os.path.join(tmp_dir, str(uuid.uuid4()))
    torch.save(x, name)
    return name

def unpack(name):
    return torch.load(name)
```
Relanding previous PR: https://github.com/pytorch/pytorch/pull/61834
Original PR led to timeout error in: https://www.internalfb.com/mast/job/yuguo-release_canary_offline_training-inlinecvrp_a-canary_offline_train_28a7ecfc
Now passing: https://www.internalfb.com/mast/job/quach-release_canary_offline_training-inlinecvrp_a-canary_offline_train_9bb57e98
The difference with the new version is we don't need to acquire the GIL when calling `PyDefaultSavedVariableHooks::get_hooks`.
Test Plan: Imported from OSS
Reviewed By: iramazanli
Differential Revision: D30045405
Pulled By: Varal7
fbshipit-source-id: 7f6c07af3a56fe8835d5edcc815c15ea4fb4e332
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61834
Expose a pair of functions to Python users: torch.autograd.graph.set_saved_tensors_default_hooks(pack, unpack) and torch.autograd.graph.reset_saved_tensors_default_hooks().
These functions control the hooks applied to saved tensors: all tensors saved in that context will be packed using the pack function, then unpacked accordingly when needed.
Currently, this works by simply calling register_hooks (cf #60975) directly at the end of the constructor of a SavedVariable. This could be optimized further by not performing the copy before registering default hooks, but this would require a small refactor. Edit: the refactor is done in #61927.
A current limitation is that if users create tensors in this context, they will not be able to register additional hooks on the saved tensor.
For instance, to perform something like #28997, one could define a pack function that saves to disk whenever the tensor size is too big and returns a filename, then unpack simply reads the content of the file and outputs a tensor, e.g.:
```
import os, tempfile, uuid
import torch

tmp_dir = tempfile.mkdtemp()

def pack(x):
    name = os.path.join(tmp_dir, str(uuid.uuid4()))
    torch.save(x, name)
    return name

def unpack(name):
    return torch.load(name)
```
Test Plan: Imported from OSS
Reviewed By: zou3519
Differential Revision: D29792193
Pulled By: Varal7
fbshipit-source-id: 33e931230ef59faa3ec8b5d11ef7c05539bce77c
Summary:
This PR un-reverts https://github.com/pytorch/pytorch/issues/61475 + fixes compilation with MSVC, that does not recognize alternative operator spellings (i.e. using `or` instead of `||` )
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61937
Reviewed By: albanD
Differential Revision: D29805941
Pulled By: malfet
fbshipit-source-id: 01e5963c6717c1b44b260300d87ba0bf57f26ce9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60021
Dropping the imaginary component is expected and gives the correct gradient
formula, so silencing the warning is appropriate.
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D29589371
Pulled By: mruberry
fbshipit-source-id: 73e1511cae69207dc9abe576e2769ee1d03f1bbd
Summary:
Partially addresses https://github.com/pytorch/pytorch/issues/49825 by improving the testing
- Rename some of the old tests that had "inplace_view" in their names, but actually mean "inplace_[update_]on_view" so there is no confusion with the naming
- Adds some tests in test_view_ops that verify basic behavior
- Add tests that creation meta is properly handled for no-grad, multi-output, and custom function cases
- Add a test that verifies that in the cross-dtype view case, the inplace views won't be accounted for in the backward graph on rebase, as mentioned in the issue.
- Update inference mode tests to also check in-place
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59891
Reviewed By: albanD
Differential Revision: D29272546
Pulled By: soulitzer
fbshipit-source-id: b12acf5f0e3f788167ebe268423cdb58481b56f6
Summary:
The grad() function needs to return the updated values, and hence
needs a non-empty `inputs` argument to populate.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52016
Test Plan:
Passes Python and C++ unit tests, and added new tests to catch this behavior.
Fixes https://github.com/pytorch/pytorch/issues/47061
Reviewed By: albanD
Differential Revision: D26406444
Pulled By: dagitses
fbshipit-source-id: 023aeca9a40cd765c5bad6a1a2f8767a33b75a1a
Summary:
We only set the value and not the actual version counter (VC).
This means that in the context of double backward, if that saved tensor is saved again and the original Tensor is modified inplace, we would not detect it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60195
Reviewed By: Varal7
Differential Revision: D29208766
Pulled By: albanD
fbshipit-source-id: 81175f8e3f111f89524f8e46f47577b2ea4fc945
Summary:
Fixes https://github.com/pytorch/pytorch/issues/4661
- Add warnings in engine's `execute` function so it can be triggered through both cpp and python codepaths
- Adds an RAII guard version of `c10::Warning::set_warnAlways` and replaces all prior usages of the set_warnAlways with the new one
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59412
Reviewed By: jbschlosser
Differential Revision: D28969294
Pulled By: soulitzer
fbshipit-source-id: b03369c926a3be18ce1cf363b39edd82a14245f0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59483
... for functions that are not implemented
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D28933806
fbshipit-source-id: dadae1af6609f15419cf0f47a98361dc87dff849
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54987
Based off of ezyang's (https://github.com/pytorch/pytorch/pull/44799) and bdhirsh's (https://github.com/pytorch/pytorch/pull/43702) prototypes:
Here's a summary of the changes in this PR:
This PR adds a new dispatch key called Conjugate. This enables us to make conjugate operation a view and leverage the specialized library functions that fast path with the hermitian operation (conj + transpose).
1. Conjugate operation will now return a view with conj bit (1) for complex tensors and returns self for non-complex tensors as before. This also means `torch.view_as_real` will no longer be a view on conjugated complex tensors and is hence disabled. To fill the gap, we have added `torch.view_as_real_physical` which would return the real tensor agnostic of the conjugate bit on the input complex tensor. The information about conjugation on the old tensor can be obtained by calling `.is_conj()` on the new tensor.
2. NEW API (see the usage sketch below):
a) `.conj()` -- now returning a view.
b) `.conj_physical()` -- does the physical conjugate operation. If the conj bit for input was set, you'd get `self.clone()`, else you'll get a new tensor with conjugated value in its memory.
c) `.conj_physical_()`, and `out=` variant
d) `.resolve_conj()` -- materializes the conjugation. returns self if the conj bit is unset, else returns a new tensor with conjugated values and conj bit set to 0.
e) `.resolve_conj_()` in-place version of (d)
f) `view_as_real_physical` -- as described in (1), it's functionally same as `view_as_real`, just that it doesn't error out on conjugated tensors.
g) `view_as_real` -- existing function, but now errors out on conjugated tensors.
3. Conjugate Fallback
a) Vast majority of PyTorch functions would currently use this fallback when they are called on a conjugated tensor.
b) This fallback is well equipped to handle the following cases:
- functional operation e.g., `torch.sin(input)`
- Mutable inputs and in-place operations e.g., `tensor.add_(2)`
- out-of-place operation e.g., `torch.sin(input, out=out)`
- Tensorlist input args
- NOTE: Meta tensors don't work with conjugate fallback.
4. Autograd
a) `resolve_conj()` is an identity function w.r.t. autograd
b) Everything else works as expected.
5. Testing:
a) All method_tests run with conjugate view tensors.
b) OpInfo tests that run with conjugate views
- test_variant_consistency_eager/jit
- gradcheck, gradgradcheck
- test_conj_views (that only run for `torch.cfloat` dtype)
NOTE: functions like `empty_like`, `zeros_like`, `randn_like`, `clone` don't propagate the conjugate bit.
Follow up work:
1. conjugate view RFC
2. Add neg bit to re-enable view operation on conjugated tensors
3. Update linalg functions to call into specialized functions that fast path with the hermitian operation.
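As referenced in (2), here is a small usage sketch of the view-based conj API (a minimal sketch of the behavior described above):
```
import torch

z = torch.tensor([1 + 1j, 2 - 3j])
zc = z.conj()                # lazy view: only the conj bit is set, no copy
print(zc.is_conj())          # True
print(z.is_conj())           # False

zm = zc.resolve_conj()       # materializes the conjugation
print(zm.is_conj())          # False; values are now physically conjugated
```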
Test Plan: Imported from OSS
Reviewed By: VitalyFedyunin
Differential Revision: D28227315
Pulled By: anjali411
fbshipit-source-id: acab9402b9d6a970c6d512809b627a290c8def5f
Summary:
Adds `is_inference` as a native function w/ manual cpp bindings.
Also changes instances of `is_inference_tensor` to `is_inference` to be consistent with other properties such as `is_complex`.
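For illustration (a minimal sketch), the property reads like the other `is_*` checks:
```
import torch

t = torch.randn(2, dtype=torch.cfloat)
print(t.is_complex())    # True
print(t.is_inference())  # False; only True for tensors created in inference mode
```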
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58729
Reviewed By: mruberry
Differential Revision: D28874507
Pulled By: soulitzer
fbshipit-source-id: 0fa6bcdc72a4ae444705e2e0f3c416c1b28dadc7
Summary:
There are two main changes here:
- THPVariable will now actually visit its grad_fn if there is no other reference to the c++ Tensor and no other reference to the grad_fn. The critical observation compared to the existing comment (thanks Ed!) is that if we also check that the c++ Tensor object is not referenced somewhere else, we're sure that no one can change the grad_fn refcount between the traverse and the clear.
- THPVariable doesn't need a special clear for this new case: since we're the only owner of the c++ Tensor, cdata.reset() will necessarily free the Tensor and all its resources.
The two tests are to ensure:
- That the cycles are indeed collectible by the gc
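As a rough illustration of the kind of cycle this makes collectible (a minimal sketch; the actual tests live in test_autograd.py):
```
import gc
import torch

x = torch.randn(3, requires_grad=True).clone()
x.grad_fn.metadata["self"] = x   # Python-level cycle: tensor -> grad_fn -> tensor
del x
gc.collect()                     # with this change, the cycle can be collected
```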
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58271
Reviewed By: ngimel
Differential Revision: D28796461
Pulled By: albanD
fbshipit-source-id: 62c05930ddd0c48422c79b03118db41a73c1355d
Summary:
Fixes https://github.com/pytorch/pytorch/issues/57679
##### Release Notes
This is part of the end of the deprecation of inplace/view:
- `detach_` will now raise an error when invoked on any view created by `split`, `split_with_sizes`, or `chunk`. You should use the non-inplace `detach` instead.
- The error message for when an in-place operation (that is not detach) is performed on a view created by `split`, `split_with_sizes`, or `chunk` has been changed from "This view is **an** output of a function..." to "This view is **the** output of a function...".
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58285
Reviewed By: bdhirsh
Differential Revision: D28441980
Pulled By: soulitzer
fbshipit-source-id: e2301d7b8cbc3dcdd328c46f24bcb9eb7f3c0d87
Summary:
Fixes https://github.com/pytorch/pytorch/issues/56608
- Adds binding to the `c10::InferenceMode` RAII class in `torch._C._autograd.InferenceMode` through pybind. Also binds the `torch.is_inference_mode` function.
- Adds context manager `torch.inference_mode` to manage an instance of `c10::InferenceMode` (global). Implemented in `torch.autograd.grad_mode.py` to reuse the `_DecoratorContextManager` class.
- Adds some tests based on those linked in the issue + several more for just the context manager
Issues/todos (not necessarily for this PR):
- Improve short inference mode description
- Small example
- Improved testing since there is no direct way of checking TLS/dispatch keys
-
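A small example of the context manager (a hedged sketch of the intended usage):
```
import torch

with torch.inference_mode():
    x = torch.ones(2, 3)
    y = x * 2
    print(y.is_inference())  # True
    print(y.requires_grad)   # False: no autograd tracking inside inference mode
```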
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58045
Reviewed By: agolynski
Differential Revision: D28390595
Pulled By: soulitzer
fbshipit-source-id: ae98fa036c6a2cf7f56e0fd4c352ff804904752c
Summary:
Port addmm to a structured kernel.
Follow-ups:
- migrate `mm` and `addbmm` to structured kernels
- move the TORCH_CHECKs currently in `addmm_cpu_impl_` and `addmm_out_cuda_impl` to meta
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57417
Reviewed By: bdhirsh
Differential Revision: D28291001
Pulled By: walterddr
fbshipit-source-id: 4eafaa30a465e225fbb4d2a69a36f1e037df9122
Summary:
This one had a tricky usage of `torch.symeig` that had to be replaced. I tested the replacement locally though.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57732
Reviewed By: bdhirsh
Differential Revision: D28328189
Pulled By: mruberry
fbshipit-source-id: 7f000fcbf2b029beabc76e5a89ff158b47977474
Summary:
Backward methods for `torch.lu` and `torch.lu_solve` require the `torch.lu_unpack` method.
However, while `torch.lu` is a Python wrapper over a native function, so its gradient can be implemented via `autograd.Function`,
`torch.lu_solve` is a native function and so cannot access `torch.lu_unpack`, which is implemented in Python.
Hence this PR adds a native (ATen) `lu_unpack`. With this function it is also possible to update the gradients for `torch.lu` so that backward+JIT is supported (there is no JIT support for `autograd.Function`).
~~The interface for this method is different from the original `torch.lu_unpack`, so it is decided to keep it hidden.~~
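A usage sketch of the Python-level API this native kernel backs (using the linalg factorization entry point; `torch.lu` was the entry point at the time of this PR):
```
import torch

A = torch.randn(3, 3)
LU, pivots = torch.linalg.lu_factor(A)
P, L, U = torch.lu_unpack(LU, pivots)
print(torch.allclose(P @ L @ U, A))   # True
```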
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46913
Reviewed By: albanD
Differential Revision: D28355725
Pulled By: mruberry
fbshipit-source-id: 281260f3b6e93c15b08b2ba66d5a221314b00e78
Summary:
Fixes https://github.com/pytorch/pytorch/issues/30696
### Release Notes
Instantiating a custom autograd function is now deprecated. Users should call `.apply()` on the class itself because it is a static method.
--end release notes--
- There are a couple of error messages that we can't entirely remove because accessing these attributes of an autograd function instance may segfault (due to cdata being nullptr). Also added a TORCH_CHECK for the name attribute, which previously segfaulted.
- Error messages were updated to convey that 1) old-style functions have been deprecated and 2) this access pattern was once valid.
- Updates variable -> Tensor in some error messages.
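For reference, the supported (new-style) pattern defines static methods and calls `.apply()` on the class rather than instantiating it (the `Exp` function below is just illustrative):
```
import torch

class Exp(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        out = x.exp()
        ctx.save_for_backward(out)
        return out

    @staticmethod
    def backward(ctx, grad_output):
        (out,) = ctx.saved_tensors
        return grad_output * out

x = torch.randn(3, requires_grad=True)
y = Exp.apply(x)   # do not write Exp()(x); instantiation is deprecated
```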
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57357
Reviewed By: mrshenli
Differential Revision: D28193095
Pulled By: soulitzer
fbshipit-source-id: f021b105e9a3fd4a20d6ee3dfb6a06a8c34b10ca
Summary:
This makes detach both forward and backward non-differentiable by default.
You can pass the `only_backward_mode=True` argument to make it forward differentiable but backward non-differentiable.
The important side effect of this change is that, by default, detach is not tracking any view information.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57820
Reviewed By: ezyang
Differential Revision: D28287633
Pulled By: albanD
fbshipit-source-id: bdc4726fcd05889f6ac84e5a3a3ef71b2ec41015
Summary:
This PR also removes qr and eig tests from test/test_torch.py. They were not skipped if compiled without LAPACK and they are now replaced with OpInfos.
Fixes https://github.com/pytorch/pytorch/issues/55929
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56284
Reviewed By: ejguan
Differential Revision: D27827077
Pulled By: mruberry
fbshipit-source-id: 1dceb955810a9fa34bb6baaccbaf0c8229444d3a
Summary:
The problem arises for sinc'(x) where x != 0 but x ** 2 == 0, which happens for some very small floats.
I realized that my solution from https://github.com/pytorch/pytorch/issues/56763 was incomplete when I did a quick implementation using `torch.autograd.Function` and still got a `NaN` from my derivative.
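A minimal sketch of the failure mode (the value is chosen so that x != 0 but x * x underflows to 0 in float32):
```
import torch

x = torch.tensor([1e-30], dtype=torch.float32, requires_grad=True)
print(x != 0)     # tensor([True])
print(x * x)      # tensor([0.]) -- squaring underflows
torch.sinc(x).backward()
print(x.grad)     # finite after this fix (previously the derivative could be NaN)
```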
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56986
Reviewed By: gchanan
Differential Revision: D28093507
Pulled By: albanD
fbshipit-source-id: 2a30e1065b08c5c60de843a0778dedeb0fb295f4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54153
Currently, sparse tensors only support real floating point tensors. Complex support is added in this PR for CPU/CUDA.
- [x] add complex support (torch.cfloat and torch.cdouble) to torch.sparse_coo_tensor constructors
- [x] add complex support to coalesce function
- [x] add complex support to to_dense function
- [x] add complex support to to_sparse function
- [x] add complex support to sparse_add function
- [x] add unit tests
Note: This PR contains only complex support for the torch.sparse_coo_tensor forward function and the related ops used with this function (coalesce, to_dense, to_sparse, and sparse_add). The following PRs in the ghstack should cover other sparse operations to provide more complete complex sparse support, specifically related to the use of specific APIs for accelerated linear algebra.
Note: Before using ghstack the original PR was #50984
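A small sketch of the newly supported dtypes with the ops covered here:
```
import torch

i = torch.tensor([[0, 1, 1]])
v = torch.tensor([1 + 2j, 3 - 1j, 0.5j], dtype=torch.cfloat)
s = torch.sparse_coo_tensor(i, v, (2,))
sc = s.coalesce()            # complex coalesce
d = sc.to_dense()            # complex to_dense
s2 = (s + s).coalesce()      # complex sparse add
back = d.to_sparse()         # complex to_sparse
```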
Test Plan: Imported from OSS
Reviewed By: H-Huang
Differential Revision: D27765618
Pulled By: ezyang
fbshipit-source-id: a9cdd31d5c7a7dafd790f6cc148f3df26e884c89
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55692
### Release notes
`get_numerical_jacobian` and `get_analytical_jacobian` now only support `grad_out=1`, and `fn` no longer accepts functions that return complex output
Test Plan: Imported from OSS
Reviewed By: H-Huang
Differential Revision: D28004614
Pulled By: soulitzer
fbshipit-source-id: 9592c9c69584b4035b39be62252f138dce39d3b5
Summary:
Adding cuda synchronization when entering and exiting the profiler
context manager
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56651
Test Plan: CI
Reviewed By: gdankel
Differential Revision: D27926270
Pulled By: ilia-cher
fbshipit-source-id: 5cf30128590c1c71a865f877578975c4a6e2cb48
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55656
### For release notes
What:
- All errors that are silenced by "raise_exception=False" are now GradcheckError (which inherits from RuntimeError).
Why:
- Due to a refactor of gradcheck
Workaround:
- If you catch 'RuntimeError' with `except RuntimeError`, no changes are necessary since GradcheckError inherits from RuntimeError. However, if you explicitly check the error's type via `type(error)`, you'll need to update your code to check for `GradcheckError` instead.
Factors out all the logic involving `fail_test` and `raise_exception` into 1) a wrapper around gradcheck that uses try/except and 2) a gradcheck helper that always raises exceptions.
This allows us to avoid having to write the `if not x: return False` logic that is scattered throughout gradcheck currently.
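A sketch of how the new error type surfaces (the deliberately wrong backward in `WrongGrad` is just for illustration):
```
import torch
from torch.autograd.gradcheck import GradcheckError

class WrongGrad(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return 2 * x

    @staticmethod
    def backward(ctx, grad_output):
        return 3 * grad_output   # deliberately wrong: analytical != numerical

x = torch.randn(3, dtype=torch.double, requires_grad=True)
try:
    torch.autograd.gradcheck(WrongGrad.apply, (x,))
except GradcheckError:           # also caught by `except RuntimeError`
    print("gradcheck failed as expected")
```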
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D27920809
Pulled By: soulitzer
fbshipit-source-id: 253aef6d9a3b147ee37a6e37a4ce06437981929a
Summary:
Temporary fix to give people extra time to finish the deprecation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56401
Reviewed By: xw285cornell, drdarshan
Differential Revision: D27862196
Pulled By: albanD
fbshipit-source-id: ed460267f314a136941ba550b904dee0321eb0c6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54480
This PR shouldn't really change the behavior of gradcheck for most ops. However, the changes in test_autograd allow us to run basic checks for both fast and slow (instead of previously just slow). All it should be doing is wrapping the preexisting tests we introduced in prior PRs in a function which takes `fast_mode` as a param. We then call this function twice, once with `fast_mode=True` and once with `fast_mode=False`.
Plan for rollout:
- This PR should only land the code (and runs some basic checks as described above).
- This should help us verify that a) slow is still working as expected b) basic functionality of fast works
- After we land this, but before we run the next PR in the stack, we should land https://github.com/pytorch/pytorch/pull/55182. This is to ensure that there is no gap where the slow tests aren't running.
- The next PR is responsible for enabling the fast_mode=True flag on all tests (where the function has real inputs/outputs), and selectively disabling it for the cases that fail.
- Finally in a later PR, we reenable fast-gradcheck for functions w/ complex inputs/outputs
TODOs and open questions (not necessarily blocking this PR):
- ~How do we think about atol/rtol~ (scale atol, keep rtol as-is)
- ~reenable fast-gradcheck for complex numbers~
- ~when inputs are uncoalesced we don't truly test this case because we coalesce the inputs before calling function. Revisit this when https://github.com/pytorch/pytorch/pull/52874/files is landed~
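For reference, the new flag is just a keyword argument on the public API (minimal sketch):
```
import torch

x = torch.randn(4, dtype=torch.double, requires_grad=True)
torch.autograd.gradcheck(torch.sin, (x,), fast_mode=True)   # fast path
torch.autograd.gradcheck(torch.sin, (x,), fast_mode=False)  # slow (default) path
```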
### Developer Experience
Sample output when jacobian mismatch occurs:
```
Traceback (most recent call last):
File "/home/s/local/pytorch4/test/test_autograd.py", line 4220, in test_gradcheck_jacobian_mismatch
check(fast_mode=True)
File "/home/s/local/pytorch4/test/test_autograd.py", line 4196, in check
gradcheck(fn, (x,), fast_mode=fast_mode)
File "/home/s/local/pytorch4/torch/testing/_internal/common_utils.py", line 2067, in gradcheck
return torch.autograd.gradcheck(fn, inputs, **kwargs)
File "/home/s/local/pytorch4/torch/autograd/gradcheck.py", line 1020, in gradcheck
if not fast_gradcheck(fail_test, seeded_func, func_out, tupled_inputs, outputs, eps, rtol,
File "/home/s/local/pytorch4/torch/autograd/gradcheck.py", line 915, in fast_gradcheck
return fail_test(get_notallclose_msg(a, n, i, j, prefix) + jacobians_str)
File "/home/s/local/pytorch4/torch/autograd/gradcheck.py", line 996, in fail_test
raise RuntimeError(msg)
RuntimeError: Jacobian mismatch for output 0 with respect to input 0,
numerical:tensor(0.9195)
analytical:tensor(0.9389)
The above quantities relating the numerical and analytical jacobians are computed
in fast mode. See: https://github.com/pytorch/pytorch/issues/53876 for more background
about fast mode. Below, we recompute numerical and analytical jacobians in slow mode:
Numerical:
tensor([[1.0000, 0.0000, 0.0000, 0.0000],
[0.0000, 1.0000, 0.0000, 0.0000],
[0.0000, 0.0000, 1.0000, 0.0000],
[0.0000, 0.0000, 0.0000, 1.0000]])
Analytical:
tensor([[1.0100, 0.0100, 0.0100, 0.0100],
[0.0100, 1.0100, 0.0100, 0.0100],
[0.0100, 0.0100, 1.0100, 0.0100],
[0.0100, 0.0100, 0.0100, 1.0100]])
The max per-element difference (slow mode) is: 0.010000000000054632.
```
Additionally, if the per-element difference is small, i.e., `allclose(analytical_slow, numerical_slow, rtol, atol) is True`, we follow up with this message:
```
Fast gradcheck failed but element-wise differences are small. This means that the
test might've passed in slow_mode!
If you are adding a new operator, please file an issue and then use one of the
workarounds. The workaround depends on how your test invokes gradcheck/gradgradcheck.
If the test
- manually invokes gradcheck/gradgradcheck, then call gradcheck/gradgradcheck
with `fast_mode=False` as a keyword argument.
- is OpInfo-based (e.g., in test_ops.py), then modify the OpInfo for the test
to have `gradcheck_fast_mode=False`
- is a Module test (e.g., in common_nn.py), then modify the corresponding
module_test entry to have `gradcheck_fast_mode=False`
```
Test Plan: Imported from OSS
Reviewed By: walterddr, ejguan
Differential Revision: D27825160
Pulled By: soulitzer
fbshipit-source-id: 1fe60569d8b697c213b0d262a832622a4e9cf0c7
Summary:
Reland of https://github.com/pytorch/pytorch/pull/49098
See original issue for details.
The only difference with the previous PR is the fix of the _embedding_bag_dense_backward formula to stop declaring a backward formula for an argument that does not exist.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56083
Reviewed By: samestep
Differential Revision: D27778221
Pulled By: albanD
fbshipit-source-id: 159ef91ca931ef2ccfbc3d1c46c7880c32919dc9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54378
### For release notes
`torch.autograd.gradcheck.get_numerical_jacobian` (not part of the public api) is being deprecated.
In the future, user code relying on this function will break because, among other changes, `get_numerical_jacobian` now returns `List[Tuple[torch.Tensor]]` instead of `List[torch.Tensor]`.
(more details if necessary)
For a `fn` that takes in M inputs and N outputs we now return a list of M N-tuples of jacobians, where `output[i][j]` represents the numerical jacobian of the jth output w.r.t. the ith input. Previously `get_numerical_jacobian` returned a list of tensors where each tensor represents the jacobian w.r.t. each of the M inputs for a specific output. Finally, the function passed in as the parameter `fn` should expect to handle individual parameters, whereas previously `fn` was required to expect its parameters wrapped in a tuple.
--- end --
This PR addresses the comment here https://github.com/pytorch/pytorch/pull/53857#discussion_r595429639, to reduce the run-time of old gradcheck's `get_numerical_jacobian` by a factor of num_outputs. However, because very few ops actually return multiple outputs, there is not too much real speed up here.
The main benefit of doing this change as part of the refactor is that it helps us isolate the possible bugs that are specific to switching `get_numerical_jacobian` to run in a per-output way vs all outputs at once. Much of the logic implemented here will be the same for the fast gradcheck case, so knowing for certain that everything should pass after this stage will make the next step much simpler.
The get_numerical_jacobian api is also being used in common_nn. So we update the callsite there as well.
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D27728720
Pulled By: soulitzer
fbshipit-source-id: ee0f90b4f26ddc5fdbe949c4965eaa91c9ed0bb8
Summary:
There are a few autograd tests checking for tensors leaked by reference cycles. This changes them to use `_WeakTensorRef` over `weakref`. `_WeakTensorRef`, added in https://github.com/pytorch/pytorch/issues/52874, accesses the C++ level `TensorImpl` reference count, whereas `weakref` accesses Python refcounts and so can only tell whether the Python wrapper object gets deallocated. Not only is this less code, it also more accurately detects that the Tensor itself is deallocated.
I didn't touch `weakref` usage in [test_anomaly_assign_parent_cleanup](fc349cbcde/test/test_autograd.py (L3733)) and [test_nested_anomaly_printstack_cleanup](fc349cbcde/test/test_autograd.py (L3772)) because these are intentionally testing for python object cleanup.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55726
Reviewed By: ngimel
Differential Revision: D27718526
Pulled By: albanD
fbshipit-source-id: 37a4914360e35dd4ae8db06b29525cebec4d4b84
Summary:
Fixes https://github.com/pytorch/pytorch/issues/53651
I did not put much effort in improving the docs, as I will go over all these docs in future PRs
cc anjali411
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55085
Reviewed By: nikithamalgifb
Differential Revision: D27493604
Pulled By: anjali411
fbshipit-source-id: 413363013e188bc869c404b2d54ce1f87eef4425
Summary:
Fixes https://github.com/pytorch/pytorch/issues/52253
In the issue reproducer we can replace `torch.sparse.sum(S)` with `S.coalesce()` and get the same memory leak. The reason is that calling `coalesce()` on an already coalesced tensor returns `self`. With autograd, the result gets its `grad_fn` set to a node that contains a reference to the input tensor, creating a reference cycle. Cloning the tensor fixes this, so `coalesce` always returns a new tensor.
As an aside, `torch.sparse.sum(S)` doesn't need to coalesce. The result should be the same either way.
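A rough sketch of the pattern that leaked (hedged; see the linked issue for the full reproducer):
```
import torch

i = torch.tensor([[0, 1]])
v = torch.tensor([1.0, 2.0])
S = torch.sparse_coo_tensor(i, v, (2,), requires_grad=True)
Sc = S.coalesce()
# Before this fix, calling coalesce() on the already-coalesced Sc returned Sc
# itself, while autograd attached a grad_fn referencing the input (also Sc),
# creating a reference cycle. coalesce now clones and returns a new tensor.
T = Sc.coalesce()
```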
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52874
Reviewed By: bdhirsh
Differential Revision: D27246997
Pulled By: albanD
fbshipit-source-id: 0fe6c11043501a7874a50982afd42964f47470d3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53916
This PR fixes some bugs that are made more clear by the previous refactor.
- make sure gradcheck returns false when it's supposed to fail and when raise_exception=False.
- make sure that when test_batched_grad fails, it returns false when raise_exception=False
Removing checkIfNumericalAnalyticAreClose made sense here to me because underneath it's really doing `torch.allclose`, and using that directly instead of adding another opaque function to call seemed to make the code more clear.
TODO:
- ~add a test to see if when torch.allclose fails, we indeed return false.~
- ~uncomment test from previous PR.~
Test Plan: Imported from OSS
Reviewed By: heitorschueroff
Differential Revision: D27201692
Pulled By: soulitzer
fbshipit-source-id: 8b8dc37c59edb7eebc2e8db6f8839ce98a81d78b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53857
This PR basically just factors a lot of the logic out from the main gradcheck function into individual functions. It aims to avoid any behavior change (but we may not have enough tests to actually verify this). Refactorings that lead to any behavior change are done in the next PR in this stack.
The rationale for this change is 1) to make the main gradcheck function cleaner to read, and 2) also allow us to reuse the same pieces when we add the fast gradcheck.
Maybe this PR is also a good place to add some tests for gradcheck, i.e., make sure gradcheck fails when it should fail, so as to make sure that we are indeed not changing any logic. This will also help us make sure our fast_gradcheck does all the necessary checks:
So far existing tests are:
- test_gradcheck_fail_when_no_differentiable_outputs_and_num_grad_not_zero (test_autograd)
- test_gradcheck_single_input (test_autograd)
- test_gradcheck_sparse_input (test_autograd)
- test_gradcheck_nondeterministic (test_autograd)
- test_gradcheck (test_overrides)
Full coverage would potentially require adding the following missing tests (each test run with both raise_exception=True and raise_exception=False). The methodology for getting the list below is that for every type of error message we spit out, we make sure we can hit it:
- complex:
- when numerical != analytical when tested with imag grad_out
- check_inputs
- ~when inputs are not dense, but check_sparse_nnz is false~
- ~when none of the inputs require grad~
- ~(warning) when inputs are not double precision~
- ~when layout is not mkldnn(aka has strides) and input has a dimension with stride 0.~
- check_no_differentiable_outputs:
- ~when none of the outputs are differentiable, but numerical gradient is not zero~
- check_outputs:
- ~when sparse outputs (always raise)~
- ~when mkldnn outputs (always raise)~
- test_batched_grad
- ~when encounter runtime error while computing batched grad (print big message)~
- when not allclose (print out big message)
- test_backward_mul_by_grad_output
- ~when layout of grad_input is not the same as input~
- ~when grad_input is sparse and has incorrect sparse_dim/dense_dim~
- ~when backward not multiplied by grad_output (sparse/non-sparse case)~
- when grad is incorrect type/size
- test_undefined_grad
- ~when encounter runtime error while running backward~
- when we complete backward but grad inputs (the output of .grad()) is not none
- check_analytical_jacobian_attributes (for both complex/non complex)
- when grad input is incorrect dtype/size
Test Plan: Imported from OSS
Reviewed By: heitorschueroff
Differential Revision: D27201571
Pulled By: soulitzer
fbshipit-source-id: 86670a91e65740d57dd6ada7c6b4512786d15962
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52422
As mentioned in https://github.com/pytorch/pytorch/issues/52415,
`torch.utils.checkpoint` doesn't support checkpointing for functions which have
non-tensor inputs and outputs.
This PR resolves this issue by ensuring the autograd machinery ignores the
non-tensor inputs and outputs and processes the tensors accordingly.
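A minimal sketch of the newly supported pattern (the `fn` below is just illustrative):
```
import torch
from torch.utils.checkpoint import checkpoint

def fn(x, scale):              # mixes tensor and non-tensor inputs
    y = x * scale
    return y, "done"           # mixes tensor and non-tensor outputs

x = torch.randn(3, requires_grad=True)
y, msg = checkpoint(fn, x, 2.0)
y.sum().backward()
```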
ghstack-source-id: 124406867
Test Plan:
1) unit test
2) waitforbuildbot
Reviewed By: albanD
Differential Revision: D26507228
fbshipit-source-id: 0a5a1591570814176185362e83ad18dabd9c84b0
Summary:
Also updates the doc such that the language matches the type. For example, previously the `tensors` argument was specified as `(sequence of tensor)` but had a type annotation of `_TensorOrTensors`. Now it's correctly updated to be `Sequence[Tensor] or Tensor`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53827
Reviewed By: albanD
Differential Revision: D26997541
Pulled By: soulitzer
fbshipit-source-id: e1e609a4e9525139d0fe96f6157175481c90d6f8
Summary:
As per title. Compared to the previous version, it is lighter on the usage of `at::solve` and `at::matmul` methods.
Fixes https://github.com/pytorch/pytorch/issues/51621
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52875
Reviewed By: mrshenli
Differential Revision: D26768653
Pulled By: anjali411
fbshipit-source-id: aab141968d02587440128003203fed4b94c4c655
Summary:
When the saved variable is an output, its grad_fn is not saved in the SavedVariable, so it must be passed in during `unpack`.
Here, we can always pass in grad_fn (whether or not the saved variable is an output) because it is ignored when the saved variable is not an output.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53205
Reviewed By: gchanan, zhangguanheng66
Differential Revision: D26794365
Pulled By: soulitzer
fbshipit-source-id: e039baba20c364c4ab42ff99d0b242dd95c67fb3
Summary:
This PR adds functionality to skip a test based on CUDA version.
This way, we can be more specific when skipping a test, such as when the test only fails for a particular CUDA version.
This allows us to add back the skipped tests for CUDA 11.2 for other CUDA versions, such as 10.1 and 11.1.
I tested this locally (by using 11.0 instead of 11.2), but will run all the CI to make sure it works.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52359
Reviewed By: walterddr
Differential Revision: D26487951
Pulled By: janeyx99
fbshipit-source-id: 45c71cc6105ffd9985054880009cf68ea5ef3f6a
Summary:
Fixes https://github.com/pytorch/pytorch/issues/39784
At the time the issue was filed, there was only issue (1) below.
There are actually now two issues here:
1. We always set all inputs passed in through `inputs` arg as `needed = True` in exec_info. So if we pass in an input that has a grad_fn that is not materialized, we create an entry of exec_info with nullptr as key with `needed = True`. Coincidentally, when we perform simple arithmetic operations, such as "2 * x", one of the next edges of mul is an invalid edge, meaning that its grad_fn is also nullptr. This causes the discovery algorithm to set all grad_fns that have a path to this invalid_edge as `needed = True`.
2. Before the commit that enabled the engine skipped the dummy node, we knew that root node is always needed, i.e., we hardcode `exec_info[&graph_root]=true`. The issue was that this logic wasn't updated after the code was updated to skip the graph root.
To address (1), instead of passing in an invalid edge if an input in `inputs` has no grad_fn, we create a dummy grad_fn. This is done in both python and cpp entry points. The alternative is to add logic for both backward() and grad() cases to check whether the grad_fn is nullptr and set needed=false in that case (the .grad() case would be slightly more complicated than the .backward() case here).
For (2), we perform one final iteration of the discovery algorithm so that we really know whether we need to execute the graph root.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51940
Reviewed By: VitalyFedyunin
Differential Revision: D26369529
Pulled By: soulitzer
fbshipit-source-id: 14a01ae7988a8de621b967a31564ce1d7a00084e
Summary:
Adding CUDA 11.2 to Windows CI.
Disabled tests:
The following ran into `CUDA error: misaligned address` for CUDA 11.2: (issue linked below)
`test_where_scalar_valid_combination_cuda_complex128` in test_torch.py
`test_sgn_complex_cuda` in test_autograd.py
The following ran into `CUDA error: too many resources requested for launch` for CUDA 11.2: (https://github.com/pytorch/pytorch/issues/52002)
test_EmbeddingBag_per_sample_weights_and_new_offsets_cuda_int64_float64
test_EmbeddingBag_per_sample_weights_and_offsets_cuda_int64_float64
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51598
Reviewed By: mrshenli
Differential Revision: D26344965
Pulled By: janeyx99
fbshipit-source-id: 3c9a4ed16d748969e96593220ec0a9f33e1ffcef
Summary:
Fixes flake8 failures in test_autograd.py by using `gradcheck` from `torch.testing._internal.common_utils` rather than directly from`torch.autograd.gradcheck`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51963
Reviewed By: albanD
Differential Revision: D26339107
Pulled By: malfet
fbshipit-source-id: 63e0f12df16b70e394097ad88852984c1848a9e6
Summary:
Fixes https://github.com/pytorch/pytorch/issues/51349
The memory leak happens when 1) `create_graph` is True AND 2) detect anomaly mode is on. When a backward node's constructor is called during backward, the current evaluating node is assigned as a "parent" of the created node. The code that assigns the parent encounters the below issue:
`functionToPyObject(parent_node)` returns a new PyObject (with refcount 1) or if PyObject already exists, increments its refcount by 1. However [PyDict_SetItem](1b55b65638/Objects/dictobject.c (L1532)) calls into [insertdict](https://github.com/python/cpython/blob/v3.8.1/Objects/dictobject.c#L1034) which increments refcount again. This means that when dict is destroyed, the refcount of the PyObject is at least one. This keeps `parent_node` (the backward function) alive, which then keeps the saved tensor alive.
Similar calls in the codebase to `functionToPyObject` won't require Py_DECREF if it is then passed into a tuple (instead of dict), because the analogous PyTuple_SetItem call does not increment refcount.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51610
Reviewed By: albanD
Differential Revision: D26240336
Pulled By: soulitzer
fbshipit-source-id: 2854528f66fab9dbce448f8a7ba732ce386a7310
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51421
Mark memory events that did not happen within an operator context
explicitly in the profiler output.
Test Plan: python test/test_profiler.py -k test_memory_profiler
Reviewed By: ngimel
Differential Revision: D26166518
Pulled By: ilia-cher
fbshipit-source-id: 3c14d3ac25a7137733ea7cc65f0eb48693a98f5e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51638
This PR makes the following doc changes:
- Makes it clear to users that they should use vectorize "at their own
risk"
- Makes it clear that vectorize uses the "experimental prototype vmap"
so that when users see error messages related to vmap they will know
where it is coming from.
This PR also:
- makes it so that {jacobian, hessian} call a version of vmap that
doesn't warn the user that they are using an "experimental prototype".
The regular torch.vmap API does warn the user about this. This is to
improve the UX a little because the user already knows, from discovering
the flag and reading the docs, what they are getting themselves into.
Test Plan:
- Add test that {jacobian, hessian} with vectorize=True don't raise
warnings
Reviewed By: albanD
Differential Revision: D26225402
Pulled By: zou3519
fbshipit-source-id: 1a6db920ecf10597fb2e0c6576f510507d999c34
Summary:
Fixes https://github.com/pytorch/pytorch/issues/49756
## Background
The fix applied here is to remove the grad-enabled check from `collect_next_edges` and unconditionally return the actual collected edges. This pushes the responsibility for determining whether the function should be called without grad mode to its call sites. With this update, `collect_next_edges` will no longer incorrectly return an empty list, which caused the problem described in the issue. Three call sites depended on this behavior and have been updated.
Beyond bad printing side effects, this fix addresses the more general issue of accessing `grad_fn` with grad mode disabled after an in-place operation on a view. The included test verifies this without the use of print.
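A rough sketch of the access pattern covered by the new test (hedged illustration; the included test verifies this without print):
```
import torch

base = torch.randn(3, requires_grad=True).clone()
view = base[:2]          # a differentiable view
view.mul_(2)             # in-place op: the view's grad_fn is rebased lazily
with torch.no_grad():
    # Accessing grad_fn here (e.g. when printing the tensor) used to build an
    # incorrect graph because collect_next_edges returned an empty edge list
    # under no_grad; it now returns the real edges unconditionally.
    print(view.grad_fn)
```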
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51364
Test Plan:
```
python test/test_autograd.py TestAutogradDeviceTypeCPU.test_inplace_view_then_no_grad_cpu
```
Reviewed By: zou3519
Differential Revision: D26190451
Pulled By: jbschlosser
fbshipit-source-id: 9b004a393463f8bd4ac0690e5e53c07a609f87f0
Summary:
Fixes https://github.com/pytorch/pytorch/issues/49824
## Background
When creating a view of a view, there was a possibility that the new view would be less restrictive than the previous view, incorrectly sidestepping the error that should be thrown when using in-place operations on the new view.
The fix addresses this by propagating `CreationMeta` from the previous view to the new view. Currently, the old view's `creation_meta` is only propagated when the new view's `creation_meta == CreationMeta::DEFAULT`. This ensures that the new view is not less restrictive than the previous view wrt. allowing in-place operations.
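A minimal sketch of the restriction being preserved across a view of a view:
```
import torch

base = torch.randn(2, 2, requires_grad=True)
a, b = base.unbind(0)    # views from a multi-output op: in-place ops are forbidden
c = a[:]                 # a view of that view
# Before this fix, `c` could be less restrictive than `a`; the creation metadata
# is now propagated, so the same in-place error is still raised:
# c.mul_(2)   # RuntimeError
```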
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51061
Test Plan:
```
python test/test_autograd.py TestAutogradDeviceTypeCPU.test_inplace_view_of_multiple_output_view_cpu
python test/test_autograd.py TestAutogradDeviceTypeCUDA.test_inplace_view_of_multiple_output_view_cuda
python test/test_autograd.py TestAutogradDeviceTypeCPU.test_inplace_multiple_output_view_of_view_cpu
python test/test_autograd.py TestAutogradDeviceTypeCUDA.test_inplace_multiple_output_view_of_view_cuda
```
Reviewed By: heitorschueroff
Differential Revision: D26076434
Pulled By: jbschlosser
fbshipit-source-id: c47f0ddcef9b8449427b671aff9ad08edca70fcd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50915
Fixes #50584
Add a vectorize flag to torch.autograd.functional.jacobian and
torch.autograd.functional.hessian (default: False). Under the hood, the
vectorize flag uses vmap as the backend to compute the jacobian and
hessian, respectively, providing speedups to users.
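A short usage sketch of the new flag (the function `f` is just illustrative):
```
import torch
from torch.autograd.functional import jacobian, hessian

def f(x):
    return (x * x).sum()

x = torch.randn(5)
J = jacobian(f, x, vectorize=True)   # vmap-backed batched gradients under the hood
H = hessian(f, x, vectorize=True)
```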
Test Plan:
- I updated all of the jacobian and hessian tests to also use
vectorized=True
- I added some simple sanity check tests that check e.g. jacobian with
vectorized=False vs
jacobian with vectorized=True.
- The mechanism for vectorized=True goes through batched gradient
computation. We have separate tests for those (see other PRs in this
stack).
Reviewed By: heitorschueroff
Differential Revision: D26057674
Pulled By: zou3519
fbshipit-source-id: a8ae7ca0d2028ffb478abd1b377f5b49ee39e4a1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50615
The method tests for some of the ops have been ported to the new OpInfo based tests. This PR removes those op names from `complex_list` in `test_autograd.py`
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision: D25931268
Pulled By: anjali411
fbshipit-source-id: 4d08626431c61c34cdca18044933e4f5b9b25232
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33884
Mitigates https://github.com/pytorch/pytorch/issues/5261.
It's not possible for us to support cudnn RNN double backwards due to
limitations in the cudnn API. This PR makes it so that we raise an error
message if users try to get the double backward on a cudnn RNN; in the
error message we suggest using the non-cudnn RNN.
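A hedged sketch of the workaround the error message points to (requires CUDA + cuDNN):
```
import torch

rnn = torch.nn.LSTM(4, 4).cuda()
x = torch.randn(3, 1, 4, device="cuda", requires_grad=True)

out, _ = rnn(x)
(gx,) = torch.autograd.grad(out.sum(), x, create_graph=True)
# gx.sum().backward()   # double backward through the cudnn RNN now raises

with torch.backends.cudnn.flags(enabled=False):
    out, _ = rnn(x)      # fall back to the non-cudnn RNN, as the error suggests
```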
Test Plan: - added some tests to check the error message
Reviewed By: albanD
Differential Revision: D20143544
Pulled By: zou3519
fbshipit-source-id: c2e49b3d8bdb9b34b561f006150e4c7551a78fac
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50592
This adds a `check_batched_grad=False` option to gradcheck and gradgradcheck.
It defaults to False because gradcheck is a public API and I don't want
to break any existing non-pytorch users of gradcheck.
This:
- runs grad twice with two grad outputs, a & b
- runs a vmapped grad with torch.stack([a, b])
- compares the results of the above against each other.
Furthermore:
- `check_batched_grad=True` is set to be the default for
gradcheck/gradgradcheck inside of test_autograd.py. This is done by
reassigning to the gradcheck object inside test_autograd
- I manually added `check_batched_grad=False` to gradcheck instances
that don't support batched grad.
- I added a denylist for operations that don't support batched grad.
Question:
- Should we have a testing only gradcheck (e.g.,
torch.testing.gradcheck) that has different defaults from our public
API, torch.autograd.gradcheck?
Future:
- The future plan for this is to repeat the above for test_nn.py (the
autogenerated test will require a denylist)
- Finally, we can repeat the above for all pytorch test files that use
gradcheck.
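A minimal sketch of the new option:
```
import torch

x = torch.randn(3, dtype=torch.double, requires_grad=True)
torch.autograd.gradcheck(torch.sin, (x,), check_batched_grad=True)
torch.autograd.gradcheck(torch.sin, (x,), check_batched_grad=False)  # the default
```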
Test Plan: - run tests
Reviewed By: albanD
Differential Revision: D25925942
Pulled By: zou3519
fbshipit-source-id: 4803c389953469d0bacb285774c895009059522f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50632
I'll port the following method tests in follow-up PRs:
`'baddbmm', 'addbmm', 'addmv', 'addr'`
After the tests are ported to OpInfo based tests, it would also be much easier to add tests with complex alpha and beta values.
Edit- it seems like it's hard to port the broadcasting variant tests because one ends up skipping `test_inplace_grad` and `test_variant_consistency_eager` even for the case when inputs are not required to be broadcasted.
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D25947471
Pulled By: anjali411
fbshipit-source-id: 9faa7f1fd55a1269bad282adac2b39d19bfa4591
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49120
This adds a `check_batched_grad=False` option to gradcheck and gradgradcheck.
It defaults to False because gradcheck is a public API and I don't want
to break any existing non-pytorch users of gradcheck.
This:
- runs grad twice with two grad outputs, a & b
- runs a vmapped grad with torch.stack([a, b])
- compares the results of the above against each other.
Furthermore:
- `check_batched_grad=True` is set to be the default for
gradcheck/gradgradcheck inside of test_autograd.py. This is done by
reassigning to the gradcheck object inside test_autograd
- I manually added `check_batched_grad=False` to gradcheck instances
that don't support batched grad.
- I added a denylist for operations that don't support batched grad.
Question:
- Should we have a testing only gradcheck (e.g.,
torch.testing.gradcheck) that has different defaults from our public
API, torch.autograd.gradcheck?
Future:
- The future plan for this is to repeat the above for test_nn.py (the
autogenerated test will require a denylist)
- Finally, we can repeat the above for all pytorch test files that use
gradcheck.
Test Plan: - run tests
Reviewed By: albanD
Differential Revision: D25563542
Pulled By: zou3519
fbshipit-source-id: 125dea554abefcef0cb7b487d5400cd50b77c52c
Summary:
Fixes https://github.com/pytorch/pytorch/issues/47671
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49272
Test Plan:
```
x = torch.tensor([-2, -1, 0, 1, 2], dtype=torch.float32, requires_grad=True)
y = torch.nn.functional.elu_(x.clone(), alpha=-2)
grads = torch.tensor(torch.ones_like(y))
y.backward(grads)
```
```
RuntimeError: In-place elu backward calculation is triggered with a negative slope which is not supported.
This is caused by calling in-place forward function with a negative slope, please call out-of-place
version instead.
```
Reviewed By: albanD
Differential Revision: D25569839
Pulled By: H-Huang
fbshipit-source-id: e3c6c0c2c810261566c10c0cc184fd81b280c650
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49552
This PR:
1. Migrates independent autograd test for `hstack`, `dstack`, `vstack`, `movedim`, `moveaxis` from `test_autograd.py` to the new `OpInfo` based tests.
2. Migrates autograd test for `gather`, `index_select` from the method_tests to the new `OpInfo` based tests.
3. Enables complex backward for `stack`, `gather`, `index_select`, `index_add_` and adds tests for complex autograd for all the above mentioned ops.
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision: D25682511
Pulled By: anjali411
fbshipit-source-id: 5d8f89db4a9ec340ab99a6196987d44a23e2c6c6