Pull Request resolved: https://github.com/pytorch/pytorch/pull/77696
https://github.com/pytorch/pytorch/pull/63619 added a RECORD_FUNCTION guard to make calls to `Engine::evaluate_function` visible regardless of the underlying op. While useful, this creates an event that looks like a forward call, which somewhat complicates stitching forward and backward ops. I don't want to add complexity (and therefore work) on the hot path; instead it's fairly straightforward to stitch things back together in post-processing. This PR simply propagates the sequence number and forward thread id info up to the `evaluate_function` event.
Differential Revision: [D36302562](https://our.internmc.facebook.com/intern/diff/D36302562/)
Approved by: https://github.com/aaronenyeshi
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76253
We're observing a large QPS regression on the original PR https://github.com/pytorch/pytorch/pull/72302. For the training job we had, it regressed from 720k QPS to 450k QPS (see the test plan in FB internal). We suspect this is because the API was changed from `_record_function_enter` to `_record_function_enter_new`, and we're running experiments to confirm that. Will add more details when the runs in the test plan have finished. For now, it's better to revert the diff to unblock internal use cases, and we can think about how to reland it later.
Original commit changeset: dc9939f1fa6d
Original Phabricator Diff: D35257354
Test Plan:
on trunk: f338665947
with this diff: f338502850
Reviewed By: malfet, robieta
Differential Revision: D35853300
fbshipit-source-id: dd38042aeacb848f66756491a4c849c7c652a0e1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71708
In Python 3.2, a number of unittest assert methods were deprecated.
In Python 3.11, these deprecated asserts are removed completely. The files in this change still use the deprecated asserts.
Switch over to the syntax supported from 3.2 onwards.
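For illustration, a hedged example of the kind of change involved (the exact call sites are internal and not shown here):
```python
import unittest

class ExampleTest(unittest.TestCase):
    def test_values(self):
        x = 2 + 2
        # Deprecated since Python 3.2 and later removed:
        #   self.assertEquals(x, 4)
        #   self.assertRegexpMatches("abc123", r"\d+")
        # Supported spellings:
        self.assertEqual(x, 4)
        self.assertRegex("abc123", r"\d+")

if __name__ == "__main__":
    unittest.main()
```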
Test Plan: Tested on the internal test suite runner.
Reviewed By: ajtulloch
Differential Revision: D33503694
fbshipit-source-id: a150f296033260acf8365d77b837ce0679f57361
(cherry picked from commit abf60ed97409265222915d8265aaabedd625fd93)
Summary:
Description of the new behavior is in PythonFallbackKernel.cpp.
The updated test makes sure that we only call alias on the first Tensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73925
Reviewed By: samdow
Differential Revision: D34862940
Pulled By: albanD
fbshipit-source-id: 4d020e41c8bb8b10262dcafd524e84a5ad4d7af0
(cherry picked from commit 0aa6b56dbd3dcee830453fb02cd6c83ab7a8be06)
Summary:
Minimal example that deadlocks before but not after:
```python
import torch
from torch.autograd import Function

class Foo(Function):
    @staticmethod
    def forward(ctx, x):
        return x.clone()

    @staticmethod
    def backward(ctx, gO):
        return gO.clone()

def get_out():
    inp = torch.rand(2, requires_grad=True)

    # The python function is first so that it runs
    # last in the backward pass
    right = Foo.apply(inp)

    # An op that creates new memory
    left1 = inp.clone()
    # An op that saves its input
    left2 = left1 ** 2

    # Inplace modify so that the backward for
    # left2 always raises an error
    left1 += 1

    # An op that takes both sides as input.
    # After running, both sides' last op will be in
    # the ready queue, and the op for left will run
    # first as it was executed last during the forward
    out = left2 + right

    return out

# Nothing should be a global variable here as, from what
# I can see, Python leaks all the global objects
get_out().sum().backward()
```
Since this requires the python interpreter to die, it is hard to test in CI.
Let me know if you have an idea how to do it though.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73961
Reviewed By: malfet
Differential Revision: D34752747
Pulled By: albanD
fbshipit-source-id: 1a537b1f733e161e8d3ff053cd432b37b34d432a
(cherry picked from commit 17943e4c04c782d81deab439e010195f04e75bbd)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72301
First step in resolving #35026.
This adds `PythonRecordFunction`, which is a `torch::CustomClassHolder` for `at::RecordFunction`, to keep the ATen code free of torch includes. It also adds a new, currently unused internal API function, `_record_function_enter_new`, which returns the torchbind object.
Once the FC period has expired, `torch.profiler.record_function` will be updated to use this new internal API. Then, once the BC period has expired, the cpp_custom_type_hack-based API can be removed.
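For context, a small example of the public profiling API whose internals this PR is preparing to change; user-facing usage is unaffected and this snippet is not part of the diff:
```python
import torch
from torch.profiler import profile, record_function

# The public record_function API stays the same; only the internal entry point
# (_record_function_enter vs. _record_function_enter_new) differs.
with profile() as prof:
    with record_function("my_matmul"):
        a = torch.randn(128, 128)
        b = torch.randn(128, 128)
        c = a @ b

print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```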
Test Plan: Imported from OSS
Reviewed By: dagitses
Differential Revision: D34586311
Pulled By: robieta
fbshipit-source-id: d3eb9ffad7b348548a2b22c75203a92d1cb5115b
(cherry picked from commit 92d2ca808e5fbd20c9d6645dcabc3f059f9ef2d3)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72688
Refactor how we decide what to run on the CPU queue.
The Lazy Tensor device was moved there because it is always present as a device guard, which would make the number of devices 1 all the time (forcing the creation of a worker thread).
FYI wconstab, you most likely don't care about this unless you ever use multiple Lazy devices.
This should slightly improve perf when running backward with Lazy Tensors, as the work will be done in the main thread rather than in a worker thread.
Test Plan: Imported from OSS
Reviewed By: soulitzer
Differential Revision: D34180245
Pulled By: albanD
fbshipit-source-id: 88c5d5bdd631ad01bf271d720d1eab69aba84fc0
(cherry picked from commit da7e9b902f)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72008
Fixes #71119
Technically BC-breaking: when an input does not require grad, it was previously returned as-is rather than as a view, since a view wasn't needed. Now we will also return a view in that case (whether or not forward AD runs).
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D33859553
Pulled By: soulitzer
fbshipit-source-id: 81b3fa371f4c0904630878500aa190492c562367
(cherry picked from commit ee74bc8234)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71707
Why?
- detach should behave like jax.stop_gradient in functorch. Because it does not detach all the way through, functorch (as well as a Tensor subclass wrapping a Tensor subclass) won't see it after the first layer/subclass handles it.
How?
- This PR changes detach to dispatch all the way through to the backend (see the sketch after this list).
- This PR also modifies native::detach to call shallow_copy_and_detach instead of native::alias. This is because today the semantics of detach and alias are different: they differ only by allow_tensor_metadata_change. In the future, we may choose to deprecate this flag.
- NB: Before and after this PR, detach() shows up twice in torch_dispatch: https://github.com/pytorch/pytorch/issues/71725. This is not a regression, so I didn't want to fix it in this PR because it is weird to fix.
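A rough sketch of what "dispatching all the way through" means from the Python side. This uses `TorchDispatchMode` (a later public API, assumed here only as a stand-in for a Tensor subclass or functorch layer):
```python
import torch
from torch.utils._python_dispatch import TorchDispatchMode

class LogOps(TorchDispatchMode):
    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        # Any op that actually reaches the backend/Python dispatch key is visible here.
        print("saw:", func)
        return func(*args, **(kwargs or {}))

x = torch.randn(3)
with LogOps():
    y = x.detach()  # after this change, detach is observable at this level
```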
Test Plan: - added new tests; run existing tests
Reviewed By: albanD
Differential Revision: D33752860
Pulled By: zou3519
fbshipit-source-id: 40cc2dc8232e75a02586a4ba5b0ef5f16cb76617
(cherry picked from commit f88aae426e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69827
In general, the current pattern allows for implementing optimisations
for all the backends in a common place (see for example the optimisation
for empty matrices).
After this PR, `torch.svd` is implemented in terms of `linalg.svd` and
`linalg.svdvals`, as expected. This makes it differentiable in the case
when `compute_uv=False`, although this is not particularly important, as
`torch.svd` will eventually be deprecated.
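For reference, a small example of the relationship described above from the user's perspective (this is not the internal implementation):
```python
import torch

A = torch.randn(5, 3, dtype=torch.float64)

# linalg.svd / linalg.svdvals are the primitives torch.svd is now built on.
U, S, Vh = torch.linalg.svd(A, full_matrices=False)  # U: 5x3, S: 3, Vh: 3x3
S_only = torch.linalg.svdvals(A)                      # what compute_uv=False needs

torch.testing.assert_close(U @ torch.diag(S) @ Vh, A)
torch.testing.assert_close(S, S_only)
```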
This PR also instantiates smaller `U` / `V` when calling cusolver_gesvdj
in the cases when `full_matrices=False` or `compute_uv=False`.
The memory for the auxiliary `U` and `V` needed by some cuSOLVER routines in the cases above is allocated via raw allocators rather than through fully fledged tensors, as it's just a blob of memory the algorithm requests.
As the code is better structured now, it was easier to see that `U` and
`Vh` needn't be allocated when calling `svd_cusolver_gesvd`.
Now `linalg.svdvals` works as expected wrt the `out=` parameter.
Note that in the test `test_svd_memory_allocation` we were
passing a tensor of the wrong size and dtype and the test seemed to
pass...
This PR also changes the backward formula to avoid saving the input
matrix, as it's not necessary. In a follow up PR, I will clean the
backward formula and make it more numerically stable and efficient.
This PR also does a number of memory optimisations here and there, and fixes
the call to cusolver_gesvd, which was incorrect for m <= n. To test
this path, I compiled the code with a flag to unconditionally execute
the `if (!gesvdj_convergence_check.empty())` branch, and all the tests
passed.
I also took this chance to simplify the tests for these functions in
`test_linalg.py`, as we had lots of tests that were testing some
functionality that is already currently tested in the corresponding
OpInfos. I used xwang233's feature to test both MAGMA and CUDA
backends. This is particularly good for SVD, as cuSOLVER is always
chosen over MAGMA when available, so testing MAGMA otherwise would be
tricky.
cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano
Test Plan: Imported from OSS
Reviewed By: mikaylagawarecki
Differential Revision: D33751983
Pulled By: mruberry
fbshipit-source-id: 11d48d977946345583d33d14fb11a170a7d14fd2
(cherry picked from commit a1860bd567)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71569
Not sure if this is the right API
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D33695395
Pulled By: soulitzer
fbshipit-source-id: 652b5758f15d901f98ff0da94e977030c7f3415b
(cherry picked from commit 9421a6846a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71531
Based on the comment above the original internal assert, this is the desired check. There were two options here:
1. Don't error, and automatically make jvp return a view for that tensor output (this is easier than I originally thought: https://github.com/pytorch/pytorch/pull/71531#discussion_r789211877)
2. Error (which is what this PR currently does)
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D33695399
Pulled By: soulitzer
fbshipit-source-id: dba49890a55ad1dd59ed5c41faa96bf7cfc9e562
(cherry picked from commit fdb0f266f5)
Summary:
When default hooks are set, they are pushed onto a stack.
When nesting context managers, only the inner-most hooks are applied.
There is special care needed to update the TLS code. See also https://github.com/pytorch/pytorch/issues/70940 (i.e. do we need to be storing the enabled flag as well?)
Fixes https://github.com/pytorch/pytorch/issues/70134
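A minimal sketch of the nesting behaviour, assuming the `torch.autograd.graph.saved_tensors_hooks` context manager is the default-hooks API in question:
```python
import torch

def pack_outer(t):
    print("outer pack")
    return t

def unpack_outer(t):
    print("outer unpack")
    return t

def pack_inner(t):
    print("inner pack")
    return t

def unpack_inner(t):
    print("inner unpack")
    return t

x = torch.randn(3, requires_grad=True)
with torch.autograd.graph.saved_tensors_hooks(pack_outer, unpack_outer):
    with torch.autograd.graph.saved_tensors_hooks(pack_inner, unpack_inner):
        # Tensors saved for backward here are packed with the inner-most hooks only.
        y = x.pow(2)
# During backward, the same tensors are unpacked with the inner hooks;
# the outer hooks never fire for this save.
y.sum().backward()
```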
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70932
Reviewed By: mruberry
Differential Revision: D33530370
Pulled By: albanD
fbshipit-source-id: 3197d585d77563f36c175d3949115a0776b309f4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68631
This PR:
- Adds the check that the storage numel of the base and tangent tensors is the same. This is to support the case when as_strided reveals elements that aren't indexable by the input tensor (see the illustrative snippet after this list).
- Skips the check when batched tensors are involved, because using as_strided to reveal elements that are not indexable by the input tensor is already not allowed under vmap.
- Adds tests for the above two cases, as well as an edge case regarding the conj bit (what about the neg bit?)
For functorch:
- we need to copy the batching rule implemented here
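A small illustrative snippet (assumed shapes, not taken from the PR's tests) of why storage numel matters here:
```python
import torch

# A view may index only part of its storage...
base = torch.arange(10.)[:2]            # 2 visible elements, storage numel is 10
# ...yet as_strided can legally reveal elements the view itself cannot index:
widened = base.as_strided((5,), (1,))   # reads storage elements 0..4
print(base.storage().size(), widened)   # 10  tensor([0., 1., 2., 3., 4.])
# For forward AD, the tangent therefore needs the same storage numel as the
# base; otherwise as_strided on the dual tensor has nowhere to read from.
```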
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D32899678
Pulled By: soulitzer
fbshipit-source-id: 54db9550dd2c93bc66b8fb2d36ce40799ebba794
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69592
Currently, the forward AD function for `copy_` (in `VariableTypeManual`) does not handle the broadcasting case. ~EDIT: that is a design decision, not a bug. In this PR, we make that clear as a comment.~
Note: `broadcast_to` does not have a batching rule in core, so the ops that rely on `copy_` to broadcast will still fail batched forward grad computation.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D33020603
Pulled By: soulitzer
fbshipit-source-id: 09cb702bffc74061964a9c05cfef5121f8164814
Summary:
This fixes the case when `torch.inference_mode` is called with `mode=False` (disabled). When used as a decorator, it ignored the argument and enabled inference mode anyway.
`_DecoratorContextManager` is changed so that the new instance created for each decorated call is a copy of the original instead of a new instance with default parameters.
I also added more tests to cover this case.
Current behaviour:
```python
>>> import torch
>>> x = torch.ones(1, 2, 3, requires_grad=True)
>>> @torch.inference_mode(mode=False)
... def func(x):
...     return x * x
...
>>> out = func(x)
>>> out.requires_grad
False
```
New behaviour (fixed):
```python
>>> import torch
>>> x = torch.ones(1, 2, 3, requires_grad=True)
>>> @torch.inference_mode(mode=False)
... def func(x):
...     return x * x
...
>>> out = func(x)
>>> out.requires_grad
True
```
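A minimal sketch of the mechanism behind the fix (an assumed simplification, not the actual PyTorch source): when used as a decorator, the context manager clones itself per call instead of re-instantiating with default arguments, so flags like `mode=False` are preserved.
```python
import functools

class _DecoratorContextManager:
    """Simplified stand-in for the real helper in torch.autograd.grad_mode."""

    def __call__(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The fix: clone *this* instance (keeping its arguments) instead of
            # constructing a fresh one with default parameters.
            with self.clone():
                return func(*args, **kwargs)
        return wrapper

    def clone(self):
        return self.__class__()


class toy_inference_mode(_DecoratorContextManager):
    """Toy version that just records whether it was entered with mode=True."""

    def __init__(self, mode=True):
        self.mode = mode

    def clone(self):
        return self.__class__(self.mode)  # preserve mode=False across calls

    def __enter__(self):
        print(f"entering with mode={self.mode}")

    def __exit__(self, *exc):
        return False


@toy_inference_mode(mode=False)
def func(x):
    return x * x

func(3)  # prints "entering with mode=False" (previously mode would reset to the default)
```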
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68617
Reviewed By: mrshenli
Differential Revision: D32958434
Pulled By: albanD
fbshipit-source-id: 133c69970ef8bffb9fc9ab5142dedcffc4c32945