pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
pritam	a81be44410	Fix `shard_module` to appropriately deal with sub process groups. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79264 `shard_module` API didn't work correctly with a sub-pg since `dist.scatter` actually takes the global rank as input for `src`. Fixing this by passing in the appropriate rank to `dist.scatter` Differential Revision: [D37062766](https://our.internmc.facebook.com/intern/diff/D37062766/) Approved by: https://github.com/fduwjj, https://github.com/wanchaol	2022-06-12 03:50:45 +00:00
Mikayla Gawarecki	1ec30a6647	Add offsets-based reduction to segment_reduce (CPU, CUDA) Pull Request resolved: https://github.com/pytorch/pytorch/pull/78907 Approved by: https://github.com/cpuhrsch	2022-06-11 17:43:42 +00:00
Michael Suo	c978b609f7	[ci] remove IN_CI env var The conventional env var to set is CI. Both circle and GHA set it, so IN_CI is unnecessary Pull Request resolved: https://github.com/pytorch/pytorch/pull/79229 Approved by: https://github.com/janeyx99	2022-06-11 17:16:30 +00:00
Michael Suo	f51d5233f2	[ci] fix GITHUB_ACTIONS env var checks `GITHUB_ACTIONS` is set to `true`, but some of our code checks that it is `1`. Make the checks more general. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79290 Approved by: https://github.com/janeyx99	2022-06-11 17:16:30 +00:00
George Qi	164029f783	masked logasumexp/logaddexp Pull Request resolved: https://github.com/pytorch/pytorch/pull/78291 Approved by: https://github.com/cpuhrsch	2022-06-11 05:46:36 +00:00
lezcano	54949a5abc	Simplify and optimize linalg.solve This PR heavily simplifies the code of `linalg.solve`. At the same time, this implementation saves quite a few copies of the input data in some cases (e.g. A is contiguous). We also implement it in such a way that the derivative goes from computing two LU decompositions and two LU solves to no LU decompositions and one LU solve. It also avoids a number of unnecessary copies the derivative was performing (at least the copy of two matrices). On top of this, we add a `left` kw-only arg that allows the user to solve `XA = B` rather concisely. Pull Request resolved: https://github.com/pytorch/pytorch/pull/74046 Approved by: https://github.com/nikitaved, https://github.com/IvanYashchuk, https://github.com/mruberry	2022-06-11 04:06:40 +00:00
Mikayla Gawarecki	e727539c29	Support multi-dimensional lengths in segment_reduce to support pytorch_scatter.segment_* functionalities (CUDA) Pull Request resolved: https://github.com/pytorch/pytorch/pull/77061 Approved by: https://github.com/cpuhrsch	2022-06-11 01:45:22 +00:00
anjali411	38350acf8f	Autogen Tags enum, and allow specifying tags while defining an op Pull Request resolved: https://github.com/pytorch/pytorch/pull/79322 Approved by: https://github.com/albanD	2022-06-11 00:29:32 +00:00
kshitij12345	5e656eaae5	[refs] ravel (#78421 ) As per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/78421 Approved by: https://github.com/mruberry	2022-06-10 20:20:13 +00:00
kshitij12345	3d77017674	[primTorch] refs: masked_fill (#78132 ) TODO * [x] Add error inputs Pull Request resolved: https://github.com/pytorch/pytorch/pull/78132 Approved by: https://github.com/mruberry	2022-06-10 20:19:48 +00:00
PyTorch MergeBot	b712467cd1	Revert "Add mutation checks for tensor inputs" This reverts commit `83c0a2bc38`. Reverted https://github.com/pytorch/pytorch/pull/79078 on behalf of https://github.com/davidberard98 due to broke bazel build-and-test, see https://github.com/pytorch/pytorch/runs/6836001002?check_suite_focus=true	2022-06-10 20:15:30 +00:00
goldenxuett	83c0a2bc38	Add mutation checks for tensor inputs Pull Request resolved: https://github.com/pytorch/pytorch/pull/79078 Approved by: https://github.com/davidberard98, https://github.com/Krovatkin	2022-06-10 18:17:33 +00:00
kshitij12345	adaecb2cbb	[chalf] index_select: cpu support (#79217 ) Fixes https://github.com/pytorch/pytorch/issues/79204 PR https://github.com/pytorch/pytorch/pull/78173 took care of adding CUDA support. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79217 Approved by: https://github.com/mruberry	2022-06-10 14:06:32 +00:00
pritam	b9e3d722c4	Use appropriate dtype for sharded linear implementation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79255 We use several collective operations in our sharded linear implementation and for many collectives, we do not set the `dtype` of the output tensor appropriately. As a result, using a datatype like torch.float16 (which is not the default torch.float32) results in errors. Fixing this across the board and adding appropriate tests. Differential Revision: [D37059752](https://our.internmc.facebook.com/intern/diff/D37059752/) Approved by: https://github.com/fduwjj, https://github.com/wanchaol	2022-06-10 07:32:15 +00:00
Kshiteej K	d837443a6f	[fix] composite compliance: matrix_rank (#78968 ) Ref: https://github.com/pytorch/pytorch/issues/69991 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78968 Approved by: https://github.com/zou3519	2022-06-10 05:41:19 +00:00
PyTorch MergeBot	fefff54cad	Revert "Revert "Revert "Added {logical_not, trace} refs, moved logical ops to use method overloads""" This reverts commit `a2d2981e8e`. Reverted https://github.com/pytorch/pytorch/pull/79224 on behalf of https://github.com/suo due to broke lots of things `a2d2981e8e`	2022-06-10 04:40:43 +00:00
Horace He	a2d2981e8e	Revert "Revert "Added {logical_not, trace} refs, moved logical ops to use method overloads"" This reverts commit `d67309aefb`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79224 Approved by: https://github.com/mruberry	2022-06-10 03:07:14 +00:00
PyTorch MergeBot	87a5ecced2	Revert "Support multi-dimensional lengths in segment_reduce to support pytorch_scatter.segment_* functionalities (CUDA)" This reverts commit `40f7ef1f3d`. Reverted https://github.com/pytorch/pytorch/pull/77061 on behalf of https://github.com/janeyx99 due to Broke segment_reduce tests on trunk, e.g., `40f7ef1f3d`	2022-06-10 01:57:34 +00:00
Mikayla Gawarecki	40f7ef1f3d	Support multi-dimensional lengths in segment_reduce to support pytorch_scatter.segment_* functionalities (CUDA) Pull Request resolved: https://github.com/pytorch/pytorch/pull/77061 Approved by: https://github.com/cpuhrsch	2022-06-10 00:49:37 +00:00
Joel Benjamin Schlosser	70d6446a3d	Support both train / eval modes for ModuleInfo Pull Request resolved: https://github.com/pytorch/pytorch/pull/78735 Approved by: https://github.com/albanD	2022-06-09 20:57:17 +00:00
Olga Andreeva	b1ae519df9	Added functionality for post_local SGD (#78988 ) Fixes #74556 Added functionality to save and restore step counter for model averager. Added a unittest. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78988 Approved by: https://github.com/rohan-varma, https://github.com/awgu	2022-06-09 17:47:04 +00:00
lezcano	af6321f3d8	Port linalg_qr to structured This PR simplifies the logic of `linalg.qr` using structured kernels. I also took this chance and merged a few `copy_` operations with other ops. This PR removes the previous magma implementation, as it is never faster than that of cusolver and it's rather buggy. This has the side-effect that now `qr` is not supported in Rocm. Ivan confirmed that this is fine, given how incredibly slow QR was on Rocm anyway (we were marking some tests as slow because of this...). This PR also corrects the dispatch in geqrf. Before, if we called it with a matrix for which `input.size(-2) <= 256 && batchCount(input) >= std::max<int64_t>(2, input.size(-2) / 16)` is false, and we have cublas but not cusolver, we would end up calling magma rather than cublas. This is not what the heuristic suggested. Probably we should benchmark these heuristics again, but that's beyond the scope of this PR. Note. It looks like `torch.geqrf` may be broken in MAGMA as per the previous comment in `linalg_qr_helper_magma`. IvanYashchuk wdyt? Pull Request resolved: https://github.com/pytorch/pytorch/pull/79054 Approved by: https://github.com/IvanYashchuk, https://github.com/ezyang	2022-06-09 14:41:30 +00:00
PyTorch MergeBot	d67309aefb	Revert "Added {logical_not, trace} refs, moved logical ops to use method overloads" This reverts commit `64b6bd8c1e`. Reverted https://github.com/pytorch/pytorch/pull/79000 on behalf of https://github.com/malfet due to Introduces test failure, see https://hud.pytorch.org/pr/79000	2022-06-09 13:11:23 +00:00
PyTorch MergeBot	3556457dd2	Revert "`kl_div`: fix for grads wrt `target`, double backward, forward-over-reverse AD support. (#79007 )" This reverts commit `72ad222cff`. Reverted https://github.com/pytorch/pytorch/pull/79007 on behalf of https://github.com/janeyx99 due to Broke test_fn_fwgrad_bwgrad_nn_functional_kl_div_cpu_float64 on trunk https://hud.pytorch.org/minihud?name_filter=pull%20/%20linux-xenial-py3.7-clang7-asan%20/%20test%20(default,%202,%205,%20linux.2xlarge)	2022-06-09 13:07:03 +00:00
Pearu Peterson	fb6749d977	Support CSC/BSR/BSC inputs to unary zero-preserving functions. In addition, enable testing masked reductions in sparse compressed consistency check. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78173 Approved by: https://github.com/cpuhrsch	2022-06-09 09:46:34 +00:00
Pearu Peterson	2a0e4322e6	Support ComplexHalf in nonzero and add of sparse_csr input. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79062 Approved by: https://github.com/cpuhrsch	2022-06-09 09:46:33 +00:00
Nikita Vedeneev	72ad222cff	`kl_div`: fix for grads wrt `target`, double backward, forward-over-reverse AD support. (#79007 ) Fixes https://github.com/pytorch/pytorch/issues/78867, fixes https://github.com/pytorch/pytorch/issues/65466. Adds forward-over-reverse AD support. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79007 Approved by: https://github.com/soulitzer, https://github.com/jbschlosser	2022-06-09 09:06:52 +00:00
Peter Bell	cd9e158007	Accept non-standard bools in more CUDA kernels This fixes all remaining CUDA kernels, except those using `cub` or `thrust`, to accept boolean tensors with values other than 1 or 0. I do this by using `c10::load` in more places, and also adding a `load_vector` helper into `MemoryAccess.cuh` that does the same thing for vectorized loads. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78957 Approved by: https://github.com/mruberry	2022-06-09 08:31:28 +00:00
Horace He	64b6bd8c1e	Added {logical_not, trace} refs, moved logical ops to use method overloads Pull Request resolved: https://github.com/pytorch/pytorch/pull/79000 Approved by: https://github.com/ezyang	2022-06-09 07:16:36 +00:00
PyTorch MergeBot	854c833f81	Revert "Support both train / eval modes for ModuleInfo" This reverts commit `12658fcd5b`. Reverted https://github.com/pytorch/pytorch/pull/78735 on behalf of https://github.com/malfet due to Broke eval tests on Win, 10.2 and ROCM, see `12658fcd5b`	2022-06-09 03:37:55 +00:00
Horace He	dc11a5642d	Improved stack ref and added more decomposition annotations Pull Request resolved: https://github.com/pytorch/pytorch/pull/78994 Approved by: https://github.com/mruberry	2022-06-09 03:20:28 +00:00
Joel Benjamin Schlosser	12658fcd5b	Support both train / eval modes for ModuleInfo Pull Request resolved: https://github.com/pytorch/pytorch/pull/78735 Approved by: https://github.com/albanD	2022-06-08 23:20:17 +00:00
Kshiteej K	e85f3b58ab	[fix] composite compliance: margin_ranking_loss, hinge_embedding_loss (#78935 ) Ref: #69991 Cause of failure is similar to the one discussed for fixing forward_ad of `nn.functional.linear`: https://github.com/pytorch/pytorch/pull/77950#discussion_r878328822 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78935 Approved by: https://github.com/zou3519	2022-06-08 20:58:35 +00:00
Wei Wei	79ee0c0dd5	Swap fx2trt_oss to torch_tensorrt (#950 ) (#79115 ) Summary: X-link: https://github.com/pytorch/benchmark/pull/950 X-link: https://github.com/pytorch/fx2trt/pull/91 Reviewed By: yinghai Differential Revision: D36958046 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79115 Approved by: https://github.com/yinghai, https://github.com/842974287, https://github.com/malfet	2022-06-08 19:02:43 +00:00
Khushi Agrawal	5b32c34450	[reland][complex32, jiterator] cos, sinh, cosh, tanh (#78718 ) Ref: #78458 Follows: #74537 and #74748 cc @kshitij12345 @anjali411 :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/78718 Approved by: https://github.com/anjali411, https://github.com/kshitij12345	2022-06-08 15:00:41 +00:00
Ivan Yashchuk	ff39e3493a	Test torch._refs with aten and nvfuser executors (#78926 ) This PR adds testing of references with "aten" and "nvfuser" executors using `torch._prims.executor.make_traced`. Many tests are skipped even for "aten" executor because of https://github.com/pytorch/pytorch/issues/78923. I limited the dtypes for the nvfuser executor tests because it's slow due to compilation overhead (it took about 30 mins in total). With `float32` and `int32` types nvfuser tests take 5 minutes. ``` 58 passed, 2507 skipped, 28162 deselected, 79 xfailed, 5 warnings in 297.58s (0:04:57) ``` 58 tests passed means that 29 references work correctly with nvfuser executor now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78926 Approved by: https://github.com/mruberry	2022-06-08 12:45:27 +00:00
Philip Meier	32593ef2dd	move MPS compat into common comparison machinery (#77836 ) Addresses https://github.com/pytorch/pytorch/issues/77144#issuecomment-1128168082. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77836 Approved by: https://github.com/albanD	2022-06-08 08:09:18 +00:00
soulitzer	99ffeff949	[forward ad] Sync conj for between primal and tangent on set forward grad Pull Request resolved: https://github.com/pytorch/pytorch/pull/78358 Approved by: https://github.com/Lezcano, https://github.com/zou3519	2022-06-08 04:20:17 +00:00
lezcano	f7b9a46880	Deprecate torch.lu BC-breaking note: This PR deprecates `torch.lu` in favor of `torch.linalg.lu_factor`. An upgrade guide is added to the documentation for `torch.lu`. Note this PR DOES NOT remove `torch.lu`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77636 Approved by: https://github.com/malfet	2022-06-07 22:50:14 +00:00
PyTorch MergeBot	c8a5f28fde	Revert "Test torch._refs with aten and nvfuser executors (#78926 )" This reverts commit `d4eebca7bc`. Reverted https://github.com/pytorch/pytorch/pull/78926 on behalf of https://github.com/malfet due to breaks rocms, see `d4eebca7bc`	2022-06-07 22:39:05 +00:00
lezcano	c7d6cec078	Add linalg.lu_solve This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA when calling the batched MAGMA backend with trans=True. We work around that by solving the system as two triangular systems. We also update the heuristics for this function, as they were fairly outdated. We found that cuSolver is king, so luckily we do not need to rely on the buggy backend from magma for this function. We added tests testing this function left and right. We also added tests for the different backends. We also activated the tests for AMD, as those should work as well. Fixes https://github.com/pytorch/pytorch/issues/61657 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77634 Approved by: https://github.com/malfet	2022-06-07 22:28:28 +00:00
Omkar Salpekar	a07f57d44b	[fx2trt] support for new_ones, new_empty, as_strided, einsum (#79047 ) Fix the internal<>OSS divergence caused by D36460857 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79047 Approved by: https://github.com/frank-wei	2022-06-07 21:32:31 +00:00
Ivan Yashchuk	d4eebca7bc	Test torch._refs with aten and nvfuser executors (#78926 ) This PR adds testing of references with "aten" and "nvfuser" executors using `torch._prims.executor.make_traced`. Many tests are skipped even for "aten" executor because of https://github.com/pytorch/pytorch/issues/78923. I limited the dtypes for the nvfuser executor tests because it's slow due to compilation overhead (it took about 30 mins in total). With `float32` and `int32` types nvfuser tests take 5 minutes. ``` 58 passed, 2507 skipped, 28162 deselected, 79 xfailed, 5 warnings in 297.58s (0:04:57) ``` 58 tests passed means that 29 references work correctly with nvfuser executor now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78926 Approved by: https://github.com/mruberry	2022-06-07 20:34:07 +00:00
Mikayla Gawarecki	814ff74460	Add prod reduce option to segment_reduce + opinfo Pull Request resolved: https://github.com/pytorch/pytorch/pull/76067 Approved by: https://github.com/cpuhrsch	2022-06-07 17:06:07 +00:00
Peter Bell	c936396af2	Always convert truthy booleans to 1 Ref #54789 A `bool` has only two valid values, 1 or 0. Any in-memory value outside of those leads to undefined behavior. So, instead of `reinterpret_cast`-ing to `bool*` I introduce `c10::load<scalar_t>` which will read as `unsigned char` and convert to a valid `bool`. This gets >90% of operators working, but the remaining operators where skips and xfails have been added will require individual attention. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77122 Approved by: https://github.com/mruberry	2022-06-07 16:00:30 +00:00
Horace He	e675dbadc4	Ported gelu decomp to ref (#78697 ) Ugh... these are actually so painful to write without operator overloading lol. Decided to just utilize operator overloading, and xfail the ref tests for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78697 Approved by: https://github.com/mruberry	2022-06-06 22:30:20 +00:00
Edward Z. Yang	80f2c175be	Follow up on CR for "Replace TensorMeta with FakeTensor" See https://github.com/pytorch/pytorch/pull/78836 Signed-off-by: Edward Z. Yang <ezyangfb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/78895 Approved by: https://github.com/albanD	2022-06-06 22:20:40 +00:00
Khushi Agrawal	e7b96ad078	[complex32] sqrt-rsqrt : cuda (#77490 ) Follows #74537 cc @kshitij12345! Pull Request resolved: https://github.com/pytorch/pytorch/pull/77490 Approved by: https://github.com/ngimel	2022-06-06 20:53:54 +00:00
Kshiteej K	c461d8a977	[primTorch] refs: hsplit, vsplit (#78418 ) As per title TODO: * [x] Add error inputs (already exist) Pull Request resolved: https://github.com/pytorch/pytorch/pull/78418 Approved by: https://github.com/mruberry	2022-06-06 19:54:05 +00:00
goldenxuett	1f53d036d2	Build a __torch_dispatch__ class that records torch operator names Pull Request resolved: https://github.com/pytorch/pytorch/pull/78835 Approved by: https://github.com/Gamrix	2022-06-06 16:39:46 +00:00


2762 Commits
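
Commit f51d5233f2 in the log above notes that `GITHUB_ACTIONS` is set to `true` while some code checked for `1`, and that the checks were made more general. A minimal sketch of what such a general check could look like follows. The helper name `env_flag_enabled` and the set of accepted values are assumptions for illustration only, not the code from that pull request.

```python
import os

# Values treated as "enabled". This set is an assumption for illustration;
# the commit only states that `true` and `1` both need to be handled.
_TRUTHY = {"1", "true", "yes", "on"}


def env_flag_enabled(name, environ=None):
    """Return True if the environment variable `name` holds a truthy value.

    `environ` defaults to os.environ; passing a plain dict makes the
    helper easy to exercise without touching the real environment.
    """
    env = os.environ if environ is None else environ
    value = env.get(name, "")
    # Compare case-insensitively and ignore surrounding whitespace.
    return value.strip().lower() in _TRUTHY
```

With this shape, a call such as `env_flag_enabled("GITHUB_ACTIONS")` behaves the same whether the CI provider exports `true` or `1`, and an unset variable is treated as disabled.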