This is a little tricky: there is a diag_embed.out overload, but it's not bound
in Python because it's autogenerated; see https://github.com/pytorch/pytorch/issues/88598
So I can't "just" add the out variant to the ref, as that would make the ref
inconsistent with the torch API. To work around this, I mark the ref
as supporting out, but not the original function.
This is useful because it means that diag_embed.out now supports
symbolic shapes. However, this cannot easily be tested, because
I can't mark the out variant as supported in the normal OpInfo test.
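To make the resulting asymmetry concrete, here is a hedged illustration, assuming the ref is exposed as torch._refs.diag_embed (behaviour as of this change; the eager binding may have gained out= since):
```python
# The ref accepts out= because it is marked as supporting it, while the eager
# torch.diag_embed Python binding does not expose the autogenerated out overload.
import torch
import torch._refs as refs

x = torch.randn(2, 3)
out = torch.empty(2, 3, 3)

refs.diag_embed(x, out=out)     # works: the ref supports out=
# torch.diag_embed(x, out=out)  # fails: out= is not bound in Python for the eager op
```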
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88671
Approved by: https://github.com/mruberry
Summary:
X-link: https://github.com/pytorch/torchrec/pull/781
Move a bunch of globals to instance methods and replace all uses of them.
We move all PG-related globals under World and use a singleton instance under _world.
This creates an undocumented extension point to inject full control of how c10d
state behaves.
One simple hack is to change _world to an implementation that uses a threadlocal
and enables per-thread PGs (sketched below).
This almost gets DDP working; the per-thread PG is only missing an implementation of all_reduce.
It enables notebook usage of PTD, which is a big deal for learning it:
https://gist.github.com/kumpera/32cb051fa26b8cad8bdf671f968dcd68
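A hedged sketch of that thread-local idea, using the _World/_world names from this change (the ThreadLocalWorld subclass is illustrative only; a complete version has to cover every _World property):
```python
import threading

import torch.distributed.distributed_c10d as c10d

class ThreadLocalWorld(c10d._World):
    _tls = threading.local()

    @property
    def default_pg(self):
        # Each thread sees (and sets) its own default process group.
        return getattr(self._tls, "default_pg", None)

    @default_pg.setter
    def default_pg(self, value):
        self._tls.default_pg = value

# Swapping the singleton routes c10d state lookups through the thread-local world.
c10d._world = ThreadLocalWorld()
```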
This change ensures BC by keeping the global variables around and having the default _World wrap them.
I have relinked this diff to a new github PR, so that I can update it. The original PR is
> Pull Request resolved: https://github.com/pytorch/pytorch/pull/86348
Differential Revision: D40236769
Pulled By: yhcharles
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88471
Approved by: https://github.com/gnadathur, https://github.com/rohan-varma
tbh at this point it might be easier to make a new workflow and copy the relevant jobs...
Changes:
* Disable cuda mem leak check except on scheduled workflows
* Make pull and trunk run on a schedule which will run the memory leak check
* Periodic will always run the memory leak check -> periodic does not have parallelization anymore
* Concurrency check changed to be slightly more generous
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88373
Approved by: https://github.com/ZainRizvi, https://github.com/huydhn
Fixes: https://github.com/pytorch/pytorch/issues/88010
This PR does a couple things to stop slow gradcheck from timing out:
- Splits out test_ops_fwd_gradients from test_ops_gradients, and factors out TestFwdGradients and TestBwdGradients which both inherit from TestGradients, now situated in common_utils (maybe there is a better place?)
- Skips CompositeCompliance (and several other test files) for slow gradcheck CI since they do not use gradcheck
- Because test times for test_ops_fwd_gradients and test_ops_gradients are either unknown or wrong, we hardcode them for now to prevent the two from being put together. We can undo the hack after the actual test times are updated. (`calculate_shards` divides tests with unknown test times in a round-robin fashion; a sketch of that behavior follows this list.)
- Updates references to test_ops_gradients and TestGradients
- Test files that are skipped for slow gradcheck CI are now centrally located in run_tests.py. This reduces how fine-grained we can be with the skips, so for some skips (one so far) we still use the old skipping mechanism, e.g. for test_mps
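A minimal sketch of the sharding behavior mentioned above (not the actual tools implementation; `calculate_shards_sketch` and its signature are assumptions for illustration):
```python
from typing import Dict, List, Tuple

def calculate_shards_sketch(
    num_shards: int,
    tests: List[str],
    test_times: Dict[str, float],
) -> List[Tuple[float, List[str]]]:
    shard_times = [0.0] * num_shards
    shard_tests: List[List[str]] = [[] for _ in range(num_shards)]
    known = sorted((t for t in tests if t in test_times),
                   key=lambda t: test_times[t], reverse=True)
    unknown = [t for t in tests if t not in test_times]
    for t in known:
        # Greedy: put the longest remaining test on the currently lightest shard.
        i = shard_times.index(min(shard_times))
        shard_times[i] += test_times[t]
        shard_tests[i].append(t)
    for j, t in enumerate(unknown):
        # Tests with unknown durations are dealt out round-robin rather than by
        # measured load, so an unknown-time test can land on an already heavy shard.
        shard_tests[j % num_shards].append(t)
    return list(zip(shard_times, shard_tests))

# e.g. two shards, with test_ops_fwd_gradients having no recorded time yet
print(calculate_shards_sketch(
    2,
    ["test_ops_gradients", "test_ops_fwd_gradients", "test_nn"],
    {"test_ops_gradients": 3600.0, "test_nn": 1800.0},
))
```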
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88216
Approved by: https://github.com/albanD
After conda, this consolidates all macOS pip dependencies so that every dependency macOS CI needs is cached. Two small issues were found along the way in the `_mac-test-mps` workflow:
* It didn't have an `Install macOS homebrew dependencies` step to install libomp like the regular `_mac-test` workflow does
* It didn't install `scipy`, thus silently skipping some `signal.windows` tests
Both are fixed in this PR
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88071
Approved by: https://github.com/malfet
**`_init_param_attributes()` -> `init_flat_param_attributes()`**
We move `_init_param_attributes()` to `FlatParamHandle.init_flat_param_attributes()` (as was already marked as a to-do during previous refactoring).
**`_reset_lazy_init()`**
We no longer delete `_local_shard` from each `FlatParameter` in `_reset_lazy_init()`.
**Analysis**
Thus, the two semantic differences are that we remove the initial `if hasattr(p, "_local_shard")` early return in `_init_param_attributes()` and the `delattr(p, "_local_shard")` in `_reset_lazy_init()`.
This is safe because
- If we never call `_reset_lazy_init()`, then `init_flat_param_attributes()` is only called once. There is no opportunity for an early return.
- If we call `_reset_lazy_init()`, then `init_flat_param_attributes()` will be called again in the next `_lazy_init()`. However, since we removed the early return, all of the attributes initialized in `init_flat_param_attributes()` simply get re-initialized and override any existing attributes.
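An illustrative before/after sketch of the behavioral point above (not FSDP's real code; the function bodies are reduced to the one attribute that matters here):
```python
import torch

def old_init_param_attributes(p: torch.Tensor, shard: torch.Tensor) -> None:
    if hasattr(p, "_local_shard"):  # early return that this change removes
        return
    p._local_shard = shard

def new_init_flat_param_attributes(p: torch.Tensor, shard: torch.Tensor) -> None:
    # Unconditional: calling this again after _reset_lazy_init() simply
    # overrides any existing attributes instead of being skipped.
    p._local_shard = shard
```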
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87938
Approved by: https://github.com/mrshenli
The flash attention code path requires sm80 or newer to run on
BFloat16, so any OpInfo tests running with BFloat16 would fail with
the error:
```
RuntimeError: Expected q_dtype == at::kHalf || (is_sm8x && q_dtype == at::kBFloat16) to be true, but got false.
```
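A hedged sketch of the capability check implied by that error (the helper name is hypothetical; `torch.cuda.get_device_capability()` is the real API):
```python
import torch

def supports_bf16_flash_attention() -> bool:
    if not torch.cuda.is_available():
        return False
    major, _minor = torch.cuda.get_device_capability()
    return major >= 8  # sm80 (Ampere) and newer can run flash attention in BFloat16
```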
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86600
Approved by: https://github.com/ngimel
Re-submit of gh-72302
This still has a small performance hit, but it is much smaller. On my
machine I see `_record_function_exit._RecordFunction` take 1.05 us
compared to the `Tensor` overload taking 0.79 us.
In an overall comparison, I see a 0.7 us slowdown from 6.0 us to
6.7 us for this timeit benchmark
```python
import torch

def foo():
    with torch.profiler.record_function("foo"):
        return torch.eye(3)

%timeit foo()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76420
Approved by: https://github.com/robieta
Add sequence number support for UCC, mostly following the format of ProcessGroupNCCL.
Pass new test: `test_all_gather_object_subgroup`
Add skips for gather tests: `test_gather_object` and `test_gather_object_subgroup`
cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85047
Approved by: https://github.com/kwen2501
Meta tensor does a lot of work to make sure tensors "look" similar
to the originals; e.g., if the original was a non-leaf, the meta
converter ensures the meta tensor is a non-leaf too. Fake tensor
destroyed some of these properties when it wrapped the result in a FakeTensor.
This patch pushes the FakeTensor constructor into the meta converter
itself, so that we first create a fake tensor and then do the various
conversion bits to it to make it look right.
The two tricky bits:
- We need to have no_dispatch enabled when we allocate the initial meta
tensor, or fake tensor gets mad at us for making a meta fake tensor.
This necessitates the double-callback structure of the callback
arguments: the meta construction happens *inside* the function so
it is covered by no_dispatch (sketched below).
- I can't store tensors for the storages anymore, as that would result
in a leak. But we have untyped storage now, so I just store untyped
storages instead.
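A purely illustrative sketch of that double-callback structure (every name here is hypothetical; the real meta converter and fake tensor code differ in detail):
```python
from contextlib import contextmanager

import torch

@contextmanager
def no_dispatch_stub():
    # Stand-in for the internal no_dispatch() guard; a no-op in this sketch.
    yield

def to_meta_then_wrap(t: torch.Tensor, wrap):
    # The meta allocation happens *inside* the inner callback, so in the real
    # code it is covered by no_dispatch and fake tensor mode does not object
    # to creating a "meta fake tensor".
    def make_meta() -> torch.Tensor:
        with no_dispatch_stub():
            return torch.empty_like(t, device="meta")
    # The outer callback decides how to wrap the meta tensor (e.g. in a
    # FakeTensor); here it just invokes the inner callback for illustration.
    return wrap(make_meta)

meta_like = to_meta_then_wrap(torch.randn(3), lambda make_meta: make_meta())
print(meta_like.device)  # meta
```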
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
cc @jansel @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87943
Approved by: https://github.com/eellison, https://github.com/albanD
This PR actually has meaningful changes. We stratify `TrainingState` into two levels (sketched after this list): one is per FSDP instance and one is per `FlatParamHandle`/`FlatParameter`.
- At the FSDP instance level, we only care about `IDLE`, FSDP computation (i.e. `FORWARD_BACKWARD`), or `SUMMON_FULL_PARAMS`. These dynamically modify behavior (e.g. `summon_full_params()` forces full precision).
- At the `FlatParamHandle` level, we care about the training state for invariants and debugging. Hence, we keep `IDLE`, `FORWARD`, `BACKWARD_PRE`, `BACKWARD_POST`, and `SUMMON_FULL_PARAMS`.
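A hedged sketch of that two-level split; the member names follow the text above, but the actual enum definitions inside FSDP may differ in detail:
```python
from enum import Enum, auto

class TrainingState(Enum):
    """Per-FSDP-instance state."""
    IDLE = auto()
    FORWARD_BACKWARD = auto()
    SUMMON_FULL_PARAMS = auto()

class HandleTrainingState(Enum):
    """Per-FlatParamHandle / FlatParameter state."""
    IDLE = auto()
    FORWARD = auto()
    BACKWARD_PRE = auto()
    BACKWARD_POST = auto()
    SUMMON_FULL_PARAMS = auto()
```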
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87916
Approved by: https://github.com/mrshenli