pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
CaoE	9399e0b1ff	add fp16 support for gemm (#99498 ) ### Testing Native matmul vs. mkldnn matmul on SPR (with avx512_fp16 support) single core: Input \| Naïve impl / ms \| oneDNN / ms \| Speed up -- \| -- \| -- \| -- M: 128, N: 128, K: 128, trans_a: False, trans_b: False \| 2010.387 \| 64.700 \| 31.072 M: 128, N: 256, K: 128, trans_a: False, trans_b: False \| 4027.116 \| 107.780 \| 37.364 M: 8192, N: 768, K: 768, trans_a: False, trans_b: False \| 28685868.488 \| 90663.008 \| 316.401 56 cores: Input \| Naïve impl / ms \| oneDNN / ms \| Speed up -- \| -- \| -- \| -- M: 128, N: 128, K: 128, trans_a: False, trans_b: False \| 5.091 \| 0.24 \| 211.30 M: 128, N: 128, K: 128, trans_a: False, trans_b: True \| 5.224 \| 0.23 \| 220.09 M: 128, N: 256, K: 128, trans_a: False, trans_b: False \| 10.006 \| 0.30 \| 330.31 M: 8192, N: 768, K: 768, trans_a: False, trans_b: False \| 29435.372 \| 1.770 \| 1662.80 M: 8192, N: 768, K: 768, trans_a: False, trans_b: True \| 31464.961 \| 1.728 \| 18204.76 M: 8192, N: 768, K: 3072, trans_a: False, trans_b: False \| 115035.849 \| 7.990 \| 14396.90 M: 8192, N: 768, K: 3072, trans_a: False, trans_b: True \| 122981.023 \| 7.725 \| 15918.34 Batch: 768, M: 128, N: 64, K: 128 \| 2032.523 \| 0.705 \| 2882.23 Pull Request resolved: https://github.com/pytorch/pytorch/pull/99498 Approved by: https://github.com/jgong5, https://github.com/malfet	2023-09-28 01:03:50 +00:00
Edward Z. Yang	869226bf94	Avoid passing generator to parametrize (#110104 ) Fixes ``` ValueError: <function TestMeta.test_layer_norm_backward at 0x7f555f56e440>: An empty arg_values was passed to @parametrize. Note that this may result from reuse of a generator. ``` Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/110104 Approved by: https://github.com/malfet, https://github.com/jbschlosser, https://github.com/voznesenskym	2023-09-27 02:52:48 +00:00
Mwiza Kunda	5c4b5baf21	Fix python decomps for OpOverloadPackets and add tests (#107707 ) - Extend `test_torch_dispatch_meta_outplace` to test torch ops that do not have an out parameter but have aten op overloads that have out parameters. Additionally, Python decompositions may register `OpOverloadPacket`'s so decompositions need to be tested to ensure all `OpOverloads` still function for the `Meta` key (e.g. if a python decomposition is registered for an aten op `aten.foo` with overloads `[default, out]`, the python function needs to support receiving out arguments) - Add out parameter wrappers to python decomps for aten ops that have out overloads CC. @ezyang @albanD @lezcano Fixes #107713 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107707 Approved by: https://github.com/lezcano	2023-09-25 20:53:30 +00:00
Mwiza Kunda	8dedc9dd9b	Add meta tests for layer/group/batch norm backward (#109591 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/109591 Approved by: https://github.com/ezyang	2023-09-21 18:58:51 +00:00
CaoE	54c28c564f	add Half support for BatchNorm on CPU (#102070 ) Fixes #106543 ### Testing Single core: shape \| fp32 forward / ms \| fp16 forward / ms \| bf16 forward / ms \| fp32 backward / ms \| fp16 backward / ms \| bf16 backward / ms -- \| -- \| -- \| -- \| -- \| -- \| -- (1, 4, 256, 256) \| 0.7116 \| 0.1427 \| 0.1744 \| 0.2638 \| 0.2002 \| 0.2556 (1, 32, 100, 100) \| 0.8579 \| 0.1725 \| 0.2077 \| 0.3023 \| 0.2399 \| 0.2995 (32, 16, 200, 200) \| 57.3466 \| 12.2179 \| 13.1320 \| 45.9524 \| 24.1526 \| 24.9882 28 cores: shape \| fp32 forward / ms \| fp16 forward / ms \| bf16 forward / ms \| fp32 backward / ms \| fp16 backward / ms \| bf16 backward / ms -- \| -- \| -- \| -- \| -- \| -- \| -- (1, 4, 256, 256) \| 0.2571 \| 0.0713 \| 0.0846 \| 0.1140 \| 0.0883 \| 0.1043 (1, 32, 100, 100) \| 0.1077 \| 0.0510 \| 0.0548 \| 0.0700 \| 0.0645 \| 0.0713 (32, 16, 200, 200) \| 5.5060 \| 1.4195 \| 1.4663 \| 6.773 \| 3.0886 \| 3.1343 Pull Request resolved: https://github.com/pytorch/pytorch/pull/102070 Approved by: https://github.com/jgong5, https://github.com/mikaylagawarecki, https://github.com/mingfeima	2023-09-19 10:43:33 +00:00
Jez Ng	7f3885137f	Add meta function for _segment_reduce (#109359 ) This fixes numerous tests which were xfailing. For instance, the `_segment_reduce.lengths` OpInfo test, which was previously relying on the fallback kernel to determine the shape of the meta tensor. The fallback kernel would fail with segment_reduce(): Expected all rows of lengths along axis to sum to data.size(lengths.dim()-1) when !unsafe. as it was trying to read the values of a meta tensor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109359 Approved by: https://github.com/ezyang	2023-09-16 13:31:03 +00:00
PyTorch MergeBot	b226373d16	Revert "add Half support for BatchNorm on CPU (#102070 )" This reverts commit `b6a1d3fb97`. Reverted https://github.com/pytorch/pytorch/pull/102070 on behalf of https://github.com/clee2000 due to I'm very sorry but it looks like #106543 was not fixed, I still see it failing on main `b6a1d3fb97` https://github.com/pytorch/pytorch/actions/runs/6185704949/job/16793975677 ([comment](https://github.com/pytorch/pytorch/pull/102070#issuecomment-1719747065))	2023-09-14 16:13:34 +00:00
CaoE	b6a1d3fb97	add Half support for BatchNorm on CPU (#102070 ) Fixes #106543 ### Testing Single core: shape \| fp32 forward / ms \| fp16 forward / ms \| bf16 forward / ms \| fp32 backward / ms \| fp16 backward / ms \| bf16 backward / ms -- \| -- \| -- \| -- \| -- \| -- \| -- (1, 4, 256, 256) \| 0.7116 \| 0.1427 \| 0.1744 \| 0.2638 \| 0.2002 \| 0.2556 (1, 32, 100, 100) \| 0.8579 \| 0.1725 \| 0.2077 \| 0.3023 \| 0.2399 \| 0.2995 (32, 16, 200, 200) \| 57.3466 \| 12.2179 \| 13.1320 \| 45.9524 \| 24.1526 \| 24.9882 28 cores: shape \| fp32 forward / ms \| fp16 forward / ms \| bf16 forward / ms \| fp32 backward / ms \| fp16 backward / ms \| bf16 backward / ms -- \| -- \| -- \| -- \| -- \| -- \| -- (1, 4, 256, 256) \| 0.2571 \| 0.0713 \| 0.0846 \| 0.1140 \| 0.0883 \| 0.1043 (1, 32, 100, 100) \| 0.1077 \| 0.0510 \| 0.0548 \| 0.0700 \| 0.0645 \| 0.0713 (32, 16, 200, 200) \| 5.5060 \| 1.4195 \| 1.4663 \| 6.773 \| 3.0886 \| 3.1343 Pull Request resolved: https://github.com/pytorch/pytorch/pull/102070 Approved by: https://github.com/jgong5, https://github.com/mikaylagawarecki	2023-09-14 12:23:59 +00:00
PyTorch MergeBot	04a765f95d	Revert "add Half support for BatchNorm on CPU (#102070 )" This reverts commit `6065e7a97c`. Reverted https://github.com/pytorch/pytorch/pull/102070 on behalf of https://github.com/clee2000 due to sorry it looks like this is causing an unexpected success for `test_jit_fuser_te.py::TestNNCOpInfoCPU::test_nnc_correctness_nn_functional_batch_norm_cpu_float16` `6065e7a97c` https://github.com/pytorch/pytorch/actions/runs/6178069462/job/16770849782 ([comment](https://github.com/pytorch/pytorch/pull/102070#issuecomment-1718402208))	2023-09-13 22:38:42 +00:00
CaoE	6065e7a97c	add Half support for BatchNorm on CPU (#102070 ) Fixes #106543 ### Testing Single core: shape \| fp32 forward / ms \| fp16 forward / ms \| bf16 forward / ms \| fp32 backward / ms \| fp16 backward / ms \| bf16 backward / ms -- \| -- \| -- \| -- \| -- \| -- \| -- (1, 4, 256, 256) \| 0.7116 \| 0.1427 \| 0.1744 \| 0.2638 \| 0.2002 \| 0.2556 (1, 32, 100, 100) \| 0.8579 \| 0.1725 \| 0.2077 \| 0.3023 \| 0.2399 \| 0.2995 (32, 16, 200, 200) \| 57.3466 \| 12.2179 \| 13.1320 \| 45.9524 \| 24.1526 \| 24.9882 28 cores: shape \| fp32 forward / ms \| fp16 forward / ms \| bf16 forward / ms \| fp32 backward / ms \| fp16 backward / ms \| bf16 backward / ms -- \| -- \| -- \| -- \| -- \| -- \| -- (1, 4, 256, 256) \| 0.2571 \| 0.0713 \| 0.0846 \| 0.1140 \| 0.0883 \| 0.1043 (1, 32, 100, 100) \| 0.1077 \| 0.0510 \| 0.0548 \| 0.0700 \| 0.0645 \| 0.0713 (32, 16, 200, 200) \| 5.5060 \| 1.4195 \| 1.4663 \| 6.773 \| 3.0886 \| 3.1343 Pull Request resolved: https://github.com/pytorch/pytorch/pull/102070 Approved by: https://github.com/jgong5, https://github.com/mikaylagawarecki	2023-09-13 17:30:16 +00:00
Huamin Li	9fa5283401	[dynamo+aten] Enable embedding_bag_byte_unpack + meta kernel impl (#107937 ) Summary: ``` torch._dynamo.exc.Unsupported: unsupported operator: quantized.embedding_bag_byte_unpack.default ``` Differential Revision: D48652953 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107937 Approved by: https://github.com/houseroad	2023-08-26 08:52:42 +00:00
Jordan Fix	df16b1ed53	[dynamo+aten] Enable embedding_bag_byte_rowwise_offsets + meta kernel impl (#106105 ) Differential Revision: D47007550 Pull Request resolved: https://github.com/pytorch/pytorch/pull/106105 Approved by: https://github.com/gmagogsfm	2023-08-21 16:33:42 +00:00
Sam Larsen	e165938853	Implement decomposition for aten.rrelu_with_noise (#106812 ) Test Plan: * Primarily, added new test in test/test_decomp.py * Updated existing tests, e.g., to NOT expect failure Pull Request resolved: https://github.com/pytorch/pytorch/pull/106812 Approved by: https://github.com/eellison	2023-08-11 19:18:29 +00:00
Nikita Karetnikov	12041d8e1f	Use default dispatch table for `tensordot.out` (#106669 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/106669 Approved by: https://github.com/ezyang	2023-08-07 00:58:17 +00:00
Nikita Karetnikov	05e1a50723	[pt2] remove meta skips for `aminmax`, decomp exists (#106670 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/106670 Approved by: https://github.com/ezyang	2023-08-07 00:55:25 +00:00
Nikita Karetnikov	19621a73c0	[pt2] add metas for `grid_sampler_3d` ops (#106261 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/106261 Approved by: https://github.com/ezyang	2023-08-05 14:48:11 +00:00
Kshiteej K	a899333ffc	fix: nll_loss batch rule with negative ignore_idx (#106118 ) We use python decompositions instead of writing our own for batching rules. Fixes https://github.com/pytorch/pytorch/issues/105736 Pull Request resolved: https://github.com/pytorch/pytorch/pull/106118 Approved by: https://github.com/lezcano, https://github.com/zou3519	2023-08-04 07:43:02 +00:00
Nikita Karetnikov	1f734e03df	[pt2] add metas for `mode` ops (#106273 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/106273 Approved by: https://github.com/ezyang ghstack dependencies: #106272	2023-08-03 13:11:10 +00:00
Nikita Karetnikov	70469e6f04	[pt2] add metas for `median` ops (#106272 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/106272 Approved by: https://github.com/ezyang	2023-08-03 13:11:10 +00:00
Nikita Karetnikov	f23d755e1f	[pt2] add meta for `ormqr` (#106278 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/106278 Approved by: https://github.com/ezyang	2023-08-01 06:47:48 +00:00
Nikita Karetnikov	0ee3b84021	[pt2] add meta for `cholesky_inverse` (#106120 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/106120 Approved by: https://github.com/ezyang	2023-07-29 17:16:20 +00:00
Nikita Karetnikov	80755884be	[pt2] add meta for `cholesky` (#106115 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/106115 Approved by: https://github.com/Skylion007, https://github.com/ezyang	2023-07-29 17:16:20 +00:00
Nikita Karetnikov	a4cffaae67	[pt2] add metas for `_cholesky_solve_helper` and `cholesky_solve` (#105867 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105867 Approved by: https://github.com/ezyang	2023-07-25 20:21:47 +00:00
Justin Chu	73e1455327	[BE] Enable ruff's UP rules and autoformat test/ (#105434 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105434 Approved by: https://github.com/albanD	2023-07-19 20:36:06 +00:00
Nikita Karetnikov	c00dd43e43	[pt2] add metas for `multilabel_margin_loss` ops (#104388 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/104388 Approved by: https://github.com/ezyang	2023-07-05 13:42:22 +00:00
Nikita Karetnikov	a3aa4da154	[pt2] add metas for `multi_margin_loss` ops (#104236 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/104236 Approved by: https://github.com/ezyang	2023-07-05 13:40:05 +00:00
Nikita Karetnikov	b1c31b1d26	[pt2] metas and `SymInt` support for `max_pool` ops (#103951 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/103951 Approved by: https://github.com/Chillee, https://github.com/kulinseth	2023-07-01 01:33:35 +00:00
Nikita Karetnikov	c4a6f86062	[pt2] add metas for `max_unpool2d` and `max_unpool3d` (#103821 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/103821 Approved by: https://github.com/Skylion007, https://github.com/Chillee	2023-07-01 01:33:35 +00:00
Nikita Karetnikov	e9705c52ac	[pt2] add metas for `_pdist_forward` and `_pdist_backward` (#103817 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/103817 Approved by: https://github.com/ezyang	2023-06-22 11:18:05 +00:00
Aleksandar Samardžić	09fdea8564	Fix autograd issue with identity conversions (#92022 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/92022 Approved by: https://github.com/pearu, https://github.com/mtaaooby, https://github.com/amjames, https://github.com/cpuhrsch	2023-06-21 21:23:03 +00:00
Nikita Karetnikov	2b3d955ffd	[pt2] add meta and `SymInt` support for `linalg_matrix_exp` (#102945 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/102945 Approved by: https://github.com/lezcano	2023-06-09 22:45:16 +00:00
Edward Z. Yang	96fd283640	Preserve CreationMeta when metafying views. (#103152 ) This helps us avoid erroring / generate more accurate error messages in Dynamo when doing mutations on views. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/103152 Approved by: https://github.com/albanD	2023-06-09 12:34:54 +00:00
Nikita Karetnikov	757791d1e3	[pt2] add `SymInt` support for `linalg.vander` (#102469 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/102469 Approved by: https://github.com/Skylion007, https://github.com/lezcano	2023-06-04 09:58:02 +00:00
PyTorch MergeBot	463df86ce8	Revert "[pt2] add `SymInt` support for `linalg.vander` (#102469 )" This reverts commit `05717895aa`. Reverted https://github.com/pytorch/pytorch/pull/102469 on behalf of https://github.com/clee2000 due to broke test_aotdispatch on linux ex `05717895aa` https://github.com/pytorch/pytorch/actions/runs/5125654882/jobs/9219389448, shows up as green on pr due to bug with keep-going flag and reruns ([comment](https://github.com/pytorch/pytorch/pull/102469#issuecomment-1569041604))	2023-05-30 20:24:26 +00:00
Nikita Karetnikov	05717895aa	[pt2] add `SymInt` support for `linalg.vander` (#102469 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/102469 Approved by: https://github.com/Skylion007, https://github.com/lezcano	2023-05-30 19:50:16 +00:00
Nikita Karetnikov	995ac703cd	[pt2] add `SymInt` support for `linalg.pinv` (#102367 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/102367 Approved by: https://github.com/lezcano	2023-05-27 11:10:47 +00:00
PyTorch MergeBot	da3aba1e46	Revert "[pt2] add `SymInt` support for `linalg.pinv` (#102367 )" This reverts commit `0d5b74da0c`. Reverted https://github.com/pytorch/pytorch/pull/102367 on behalf of https://github.com/kit1980 due to Broke slow tests https://github.com/pytorch/pytorch/actions/runs/5095190248/jobs/9160028124 ([comment](https://github.com/pytorch/pytorch/pull/102367#issuecomment-1565104562))	2023-05-27 00:33:42 +00:00
Nikita Karetnikov	0d5b74da0c	[pt2] add `SymInt` support for `linalg.pinv` (#102367 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/102367 Approved by: https://github.com/lezcano	2023-05-26 15:20:34 +00:00
Nikita Karetnikov	e79d9b9938	[pt2] add `SymInt` support for `linalg.matrix_power` (#101940 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/101940 Approved by: https://github.com/lezcano, https://github.com/ezyang	2023-05-24 00:21:52 +00:00
Nikita Karetnikov	42b974e8f7	[pt2] add meta for `linalg_lu_solve` (#101836 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/101836 Approved by: https://github.com/lezcano	2023-05-24 00:21:50 +00:00
drisspg	6f13d6892a	Add meta support for multinomial (#101324 ) # Summary Found this when trying to compile the text gen loop of nanogpt here: `b33289942b/torchbenchmark/models/nanogpt_generate/model.py (L322)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/101324 Approved by: https://github.com/ngimel	2023-05-19 00:04:26 +00:00
Angela Yi	72a73ef67b	Add aten.searchsorted.Tensor meta kernel (#101637 ) Test Plan: CI Differential Revision: D45933187 Pull Request resolved: https://github.com/pytorch/pytorch/pull/101637 Approved by: https://github.com/ezyang	2023-05-18 06:55:11 +00:00
kshitij12345	afea1a9fe9	[meta] error checking for inplace ops (#101532 ) Fixes #100753 Pull Request resolved: https://github.com/pytorch/pytorch/pull/101532 Approved by: https://github.com/lezcano	2023-05-16 17:26:59 +00:00
Nikita Karetnikov	a8964d6377	[pt2] add meta and `SymInt` support for `linalg_householder_product` (#101315 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/101315 Approved by: https://github.com/lezcano	2023-05-15 02:56:49 +00:00
Sun, Jiayi	d56e1b2f67	add Half support for unary ops on CPU (#98493 ) Add Half support for log_sigmoid and some unary ops on CPU, including sinc, acosh, asinh, atanh, digamma, trigamma, rsqrt, acos, asin, atan, ceil, cos, erf, erfc, erfinv, exp, expm1, floor, log, log10, log1p, log2, i0, round, sin, sqrt, tan, tanh, trunc, lgamma. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98493 Approved by: https://github.com/jgong5, https://github.com/mingfeima, https://github.com/ngimel	2023-05-12 04:52:34 +00:00
Khushi	51fe53e619	[opinfo] item (#100313 ) Follows #100223 Pull Request resolved: https://github.com/pytorch/pytorch/pull/100313 Approved by: https://github.com/ezyang	2023-05-10 11:32:45 +00:00
Nikita Karetnikov	1e591a8b64	[pt2] add meta function for `solve_triangular` (#100829 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100829 Approved by: https://github.com/ezyang	2023-05-08 13:48:15 +00:00
Nikita Karetnikov	e87ed2a88d	[primTorch] add ref for `polar` (#100345 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100345 Approved by: https://github.com/ezyang	2023-05-04 01:37:02 +00:00
Richard Zou	4135295a76	Excise yaml dependency in torchgen.model (#100203 ) The problem: - The new CustomOp API depends on torchgen.model - torchgen.model imports `yaml` - `yaml` is not a PyTorch runtime dependency To unblock myself, because I'm not sure how long it'll take to convince people yaml should be a PyTorch runtime dependency (unless one of you wants to approve #100166), this PR removes the yaml dependency from torchgen.model. It does so by splitting torchgen.utils (the offender) into torchgen.utils (no yaml) and torchgen.yaml (which uses yaml). Test Plan: - CI Pull Request resolved: https://github.com/pytorch/pytorch/pull/100203 Approved by: https://github.com/ezyang, https://github.com/Skylion007	2023-04-28 13:45:39 +00:00
Jiong Gong	e5c9a0fcf5	[dynamo] avoid graph break on repeat_interleave.self_int (#99528 ) Address convit_base failure: https://github.com/pytorch/torchdynamo/issues/1886 mentioned in https://github.com/pytorch/pytorch/issues/93777 Also for models like EleutherAI/gpt-j-6B. Pull Request resolved: https://github.com/pytorch/pytorch/pull/99528 Approved by: https://github.com/ezyang	2023-04-25 04:47:39 +00:00
Peter Bell	7b91bd2a7b	[primTorch] Add count_nonzero (#98995 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98995 Approved by: https://github.com/lezcano	2023-04-13 22:08:19 +00:00
Nikita Karetnikov	8db04e080c	[pt2] add `SymInt` support for `cdist` (#98881 ) Fixes #98853. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98881 Approved by: https://github.com/ezyang	2023-04-12 23:06:40 +00:00
Nikita Karetnikov	ff825de442	[primTorch] add ref for `cumprod` (#98670 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98670 Approved by: https://github.com/ezyang	2023-04-09 15:22:28 +00:00
Nikita Karetnikov	b411238d76	[pt2] add meta function for `logcumsumexp` (#98683 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98683 Approved by: https://github.com/ezyang	2023-04-09 01:26:37 +00:00
Nikita Karetnikov	1c226f5aad	[pt2] add meta functions for `cummax` and `cummin` (#98552 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98552 Approved by: https://github.com/Chillee	2023-04-07 17:58:28 +00:00
albanD	0210481dcb	Fix _like meta registrations (#98160 ) The meta implementation for these _like functions is wrong whenever device != "meta" (it doesn't fill the memory!). zeros_like is special due to sparse and is fixed directly by always filling it with zeros. Every other one has a CompositeExplicit implementation, so I went with removing their meta registrations and tweaking code to avoid infinite recursion. I could do the same as for zeros_like (and add the proper filling for each), but that would duplicate the C++ logic and make the meta registrations non-trivial. I can do it if you prefer that to removal. test_meta works fine with these fixes; relying on CI to see if other tests break as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98160 Approved by: https://github.com/ezyang	2023-04-06 18:44:34 +00:00
Nikita Karetnikov	7b25976323	[pt2] add meta function for `take` (#98451 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98451 Approved by: https://github.com/ezyang	2023-04-06 14:48:35 +00:00
Wonjoo Lee	3095c95828	Fixes for PyTorch/XLA functionalization integration (#94537 ) Fixes for PyTorch/XLA functionalization integration --- Some notable changes include: - More asserts in `FunctionalTensorWrapper`, so bugs show up more cleanly in cases where we e.g. forget to wrap an output - Make the *_scatter ops `CompositeExplicitAutogradNonFunctional`, so we get a better error message and XLA doesn't accidentally try to use them - Fix LTC/XLA codegen in core to handle multi-tensor out= ops with no returns - Better erroring: Allow XLA to use the CPU fallback from core in a way so that it always errors on view ops, which XLA should no longer see. - Update MetaConverter to exclude XLA tensors in raising NotImplemented… - Add `_propagate_xla_data` op - Add meta tensor support for some ops Pull Request resolved: https://github.com/pytorch/pytorch/pull/94537 Approved by: https://github.com/bdhirsh	2023-03-02 23:02:34 +00:00
Khushi	a0389681c2	[complex] nansum & nanmean (#93199 ) Follows: #71472 Pull Request resolved: https://github.com/pytorch/pytorch/pull/93199 Approved by: https://github.com/Skylion007, https://github.com/malfet, https://github.com/kshitij12345	2023-02-16 06:13:42 +00:00
Zheng Yan	753c33bf86	Enable half type support for unique cpu (#91666 ) Test Plan: CI Differential Revision: D42326527 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91666 Approved by: https://github.com/jgong5, https://github.com/ngimel	2023-02-16 04:59:35 +00:00
haozhe.zhu	ed54a5d06b	enable bf16 emb (#94163 ) Merge https://github.com/pytorch/pytorch/pull/89199 and https://github.com/pytorch/pytorch/pull/91949 into one PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94163 Approved by: https://github.com/jianyuh, https://github.com/malfet, https://github.com/jgong5	2023-02-12 00:05:09 +00:00
Brian Hirsh	948cd61afc	add fallthrough kernel for AutogradMeta key (#94603 ) The other `Autograd[Backend]` keys all have fallthrough kernels registered to them, but `AutogradMeta` was missing the fallthrough kernel. This is a problem for custom ops that don't have autograd support, if you try to run them with meta tensors. If you have a custom op, and register a CPU and a Meta kernel, then: (1) if you run the op with cpu tensors, it will dispatch straight to the CPU kernel (as expected) (2) if you run the op with meta tensors, you will error - because we don't have a fallthrough registered to the AutogradMeta key, we will try to dispatch to the AutogradMeta key and error, since the op author hasn't provided an autograd implementation. Here's a repro that I confirmed now works: ``` import torch from torch._dispatch.python import enable_python_dispatcher from torch._subclasses.fake_tensor import FakeTensorMode lib = torch.library.Library("test", "DEF") impl_cpu = torch.library.Library("test", "IMPL", "CPU") impl_meta = torch.library.Library("test", "IMPL", "Meta") def foo_impl(x): return x + 1 lib.define("foo(Tensor a) -> Tensor") impl_meta.impl("foo", foo_impl) impl_cpu.impl("foo", foo_impl) with enable_python_dispatcher(): a = torch.ones(2, device='meta') print("@@@@@") b = torch.ops.test.foo.default(a) print(b) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/94603 Approved by: https://github.com/ezyang, https://github.com/albanD	2023-02-10 22:44:52 +00:00
Aaron Gokaslan	748bac8757	[BE]: Apply pyupgrade yield from and unit test alias upgrades (#94309 ) Applies some more harmless pyupgrades. This one gets rid of deprecated aliases in unit_tests and upgrades more yield for-loops into yield from generators, which are more performant and propagate more information / exceptions from the original generator. This is the modern recommended way of forwarding generators. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94309 Approved by: https://github.com/albanD	2023-02-07 20:08:58 +00:00
PyTorch MergeBot	53e4fe076a	Revert "enable bf16 emb (#94163 )" This reverts commit `f3bf46e801`. Reverted https://github.com/pytorch/pytorch/pull/94163 on behalf of https://github.com/huydhn due to Sorry for reverting your PR. But I suspect that it causes flaky SIGSEGV failure for linux-bionic-py3.8-clang9 / test (crossref) job in trunk. For example, `05397b1250`	2023-02-07 00:32:22 +00:00
albanD	496c0a207b	Make segment_reduce properly private. (#93166 ) I am attempting not to change the aten function to reduce the amount of BC issues on the torchscript side. Pull Request resolved: https://github.com/pytorch/pytorch/pull/93166 Approved by: https://github.com/ngimel	2023-02-06 18:32:23 +00:00
haozhe.zhu	f3bf46e801	enable bf16 emb (#94163 ) Merge https://github.com/pytorch/pytorch/pull/89199 and https://github.com/pytorch/pytorch/pull/91949 into one PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94163 Approved by: https://github.com/jianyuh, https://github.com/malfet, https://github.com/jgong5	2023-02-06 07:11:40 +00:00
Michael Suo	4e4293f15f	Add meta registration for bucketize (#93893 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/93893 Approved by: https://github.com/zhxchen17	2023-02-02 21:03:08 +00:00
Ivan Yashchuk	fba13d94a1	Remove deprecated torch.symeig (#70988 ) The time has come to remove deprecated linear algebra related functions. This PR removes `torch.symeig`. - [x] XLA PR: https://github.com/pytorch/xla/pull/4498 Pull Request resolved: https://github.com/pytorch/pytorch/pull/70988 Approved by: https://github.com/lezcano, https://github.com/kit1980, https://github.com/malfet	2023-01-31 11:59:11 +00:00
mfkasim1	75cfc0be21	Logcumsumexp for CPU (#93153 ) Partial work from #90847, in the direction of solving #89205. Most of the content is from #90847, but this is only for CPU, so hopefully it does not increase the build time by a lot. tag: @albanD, @malfet Pull Request resolved: https://github.com/pytorch/pytorch/pull/93153 Approved by: https://github.com/malfet, https://github.com/Skylion007	2023-01-27 22:29:33 +00:00
PyTorch MergeBot	9b23fd378f	Revert "Logcumsumexp for complex in CPU and CUDA (#90847 )" This reverts commit `64985123e4`. Reverted https://github.com/pytorch/pytorch/pull/90847 on behalf of https://github.com/malfet due to Reverting to decrease build time, let's discuss the alternatives here	2023-01-24 20:49:08 +00:00
PyTorch MergeBot	acdd462b1a	Revert "Remove deprecated torch.symeig (#70988 )" This reverts commit `d70ed68162`. Reverted https://github.com/pytorch/pytorch/pull/70988 on behalf of https://github.com/kit1980 due to Failing XLA tests, forward fix unsuccessful	2023-01-24 19:03:40 +00:00
Ivan Yashchuk	d70ed68162	Remove deprecated torch.symeig (#70988 ) The time has come to remove deprecated linear algebra related functions. This PR removes `torch.symeig`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/70988 Approved by: https://github.com/lezcano, https://github.com/kit1980	2023-01-23 22:51:40 +00:00
mfkasim1	64985123e4	Logcumsumexp for complex in CPU and CUDA (#90847 ) Another PR towards solving #89205. What's in this PR: * The implementation of forward `logcumsumexp` for complex numbers in CPU & CUDA * The tests on forward call of `logcumsumexp` for complex numbers * The implementation of backward `logcumsumexp` for complex numbers What's missing: * The test on backward gradient of `logcumsumexp` (it complains `RuntimeError: logcumsumexp does not support automatic differentiation for outputs with complex dtype.` and I don't know how to solve the error and I don't know where to put the test for the backward computation). If possible, I'd like this to be done in this PR. It's really tricky to handle the edge cases here (i.e. the ones involving `inf`), but I've tried my best to put some comments explaining the reasoning behind my decisions in this PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/90847 Approved by: https://github.com/albanD	2023-01-20 15:10:50 +00:00
lezcano	138a0188e0	Add support for logaddexp(float16) in CUDA and implement its reference (#91869 ) The reference is implemented so that it generates efficient and numerically stable triton code. Fixes https://github.com/pytorch/pytorch/issues/91683 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91869 Approved by: https://github.com/ngimel	2023-01-10 00:19:24 +00:00
Peter Bell	ad7aefb608	Fix Meta tests for FFT functions (#91628 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/91628 Approved by: https://github.com/kit1980	2023-01-05 00:58:26 +00:00
Edward Z. Yang	e686a442b4	If a torch.* returns non-Tensor, make this unimplemented rather than assert. (#89918 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/89918 Approved by: https://github.com/albanD	2022-12-15 21:53:54 +00:00
Natalia Gimelshein	bc93454e4a	correctly set strides for expanded/unsqueezed dimensions (#90341 ) Fixes https://github.com/pytorch/torchdynamo/issues/1959, #90260 However, I wasn't able to make existing stride tests fail before the fix, even though I'm comparing all, not just significant strides. Separately running refs on meta tensors produces wrong strides as shown in #90260, however, it looks like in meta tests some other way of computing meta info is used (I've been running ``` pytest -s -v test/test_meta.py -k test_meta_outplace_expand_cuda_float64 ``` and verified that it has sample input that should fail, and that it indeed compares all the strides, but the produced `meta_rs` results somehow still had correct strides). Edit: @SherlockNoMad helped me figure out how to fail the tests, and now I've set the correct ops for checking. `expand` fails for some test inputs because it special-cases 0-dim input case, correctly modeling it in prims would require a lot of changes, so skipping that for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/90341 Approved by: https://github.com/SherlockNoMad	2022-12-07 23:38:33 +00:00
Ram Rachum	351d73b97f	Fix exception causes all over the codebase (#90271 ) This is the continuation to #90134 and hopefully the final PR in this series. Pull Request resolved: https://github.com/pytorch/pytorch/pull/90271 Approved by: https://github.com/kit1980	2022-12-07 04:29:00 +00:00
Yanbo Liang	e1532af0bb	Fix meta registration for aten._cdist_forward (#90042 ) Error from [7k github model](https://github.com/pytorch/torchdynamo/issues/1884). Pull Request resolved: https://github.com/pytorch/pytorch/pull/90042 Approved by: https://github.com/ezyang, https://github.com/eellison	2022-12-02 21:13:52 +00:00
Jane Xu	8695f0cced	Rectify `native_batch_norm` schema by splitting it into two legit schemas (#88697 ) Using the same repro from the issue (but with BatchNorm2D) Rectifies native_batch_norm schema by splitting the schema into 2: 1. one will have NON-optional alias-able running_mean and running_var inputs 2. the other will just not have those parameters at all (no_stats variation) Calling for name suggestions! ## test plan I've added tests in test_functionalization.py as well as an entry in common_method_invocations.py for `native_batch_norm_legit` CI should pass. ## next steps Because of bc/fc reasons, we reroute native_batch_norm to call our new schemas ONLY through the python dispatcher, but in 2 weeks or so, we should make `native_batch_norm_legit` the official batch_norm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88697 Approved by: https://github.com/albanD	2022-11-23 23:23:17 +00:00
Driss Guessous	1d9e1fca97	Update sdp dispatch logic to enable fused backward (#89154 ) # Summary Reorganizes how the sdp dispatch logic is done in order to enable backwards for fused kernels Pull Request resolved: https://github.com/pytorch/pytorch/pull/89154 Approved by: https://github.com/cpuhrsch	2022-11-21 20:02:09 +00:00
PyTorch MergeBot	e1d58b1928	Revert "Update sdp dispatch logic to enable fused backward (#89154 )" This reverts commit `2e72ec7982`. Reverted https://github.com/pytorch/pytorch/pull/89154 on behalf of https://github.com/huydhn due to Sorry for reverting your PR but the new test_sdp_math_gradcheck test breaks periodic slow gradcheck, i.e. `419ef2cdcf`	2022-11-20 22:14:38 +00:00
Driss Guessous	2e72ec7982	Update sdp dispatch logic to enable fused backward (#89154 ) # Summary Reorganizes how the sdp dispatch logic is done in order to enable backwards for fused kernels Pull Request resolved: https://github.com/pytorch/pytorch/pull/89154 Approved by: https://github.com/cpuhrsch	2022-11-19 02:06:27 +00:00
lezcano	154e58c032	Add most in-place references/decompositions (#88117 ) We add most in-place references in a generic way. We also implement a wrapper to implement the annoying interface that `nn.functional` nonlinearities have. We fix along the way a couple decompositions for some non-linearities by extending the arguments that the references have. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88117 Approved by: https://github.com/mruberry	2022-11-18 14:59:46 +00:00
Nikita Karetnikov	4270bb37da	[primTorch] Improve `narrow` and `narrow_copy`: refs, tests, docs (#87045 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/87045 Approved by: https://github.com/mruberry	2022-11-12 15:03:50 +00:00
PyTorch MergeBot	93d3bd626e	Revert "[primTorch] Improve `narrow` and `narrow_copy`: refs, tests, docs (#87045 )" This reverts commit `aa8279bcb8`. Reverted https://github.com/pytorch/pytorch/pull/87045 on behalf of https://github.com/izaitsevfb due to BC-breaking change, D41161182	2022-11-09 20:48:32 +00:00
Nikita Karetnikov	aa8279bcb8	[primTorch] Improve `narrow` and `narrow_copy`: refs, tests, docs (#87045 ) Fixes #87019. Pull Request resolved: https://github.com/pytorch/pytorch/pull/87045 Approved by: https://github.com/mruberry	2022-11-09 09:19:28 +00:00
Sherlock Huang	46730aec35	[Reland] Fix primTorch compute_elementwise_output_strides (#88525 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88525 Approved by: https://github.com/desertfire	2022-11-05 05:42:07 +00:00
PyTorch MergeBot	2b117c8436	Revert "Fix primTorch compute_elementwise_output_strides (#88175 )" This reverts commit `1c8a0656d6`. Reverted https://github.com/pytorch/pytorch/pull/88175 on behalf of https://github.com/huydhn due to Sorry for reverting your PR but it breaks cuda 11.6 in trunk. As the PR signal was green, this is probably a landrace	2022-11-03 16:53:04 +00:00
Sherlock Huang	1c8a0656d6	Fix primTorch compute_elementwise_output_strides (#88175 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88175 Approved by: https://github.com/ngimel	2022-11-03 08:38:55 +00:00
soulitzer	4c20c0509d	Split out forward AD tests from test_ops_gradients and reenable slow gradcheck CI (#88216 ) Fixes: https://github.com/pytorch/pytorch/issues/88010 This PR does a couple things to stop slow gradcheck from timing out: - Splits out test_ops_fwd_gradients from test_ops_gradients, and factors out TestFwdGradients and TestBwdGradients which both inherit from TestGradients, now situated in common_utils (maybe there is a better place?) - Skips CompositeCompliance (and several other test files) for slow gradcheck CI since they do not use gradcheck - Because test times for test_ops_fwd_gradients and test_ops_gradients are either unknown or wrong, we hardcode them for now to prevent them from being put together. We can undo the hack after we see actual test times are updated. ("def calculate_shards" randomly divides tests with unknown test times in a round-robin fashion.) - Updates references to test_ops_gradients and TestGradients - Test files that are skipped for slow gradcheck CI are now centrally located in run_tests.py, this reduces how fine-grained we can be with the skips, so for some skips (one so far) we still use the old skipping mechanism, e.g. for test_mps Pull Request resolved: https://github.com/pytorch/pytorch/pull/88216 Approved by: https://github.com/albanD	2022-11-03 00:20:45 +00:00
Sherlock Huang	c00c34fb69	Fix meta for aten.upsample_bilinear2d.vec (#88158 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88158 Approved by: https://github.com/ngimel	2022-11-02 16:58:29 +00:00
lezcano	39d9d2ed70	Implement reference for lerp (#87424 ) We follow the vectorised CPU implementation for numerical accuracy Pull Request resolved: https://github.com/pytorch/pytorch/pull/87424 Approved by: https://github.com/ezyang	2022-11-02 11:21:01 +00:00
Sherlock Huang	de1f641f11	Fix meta function for aten.addmm (#88068 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88068 Approved by: https://github.com/albanD	2022-11-01 17:05:48 +00:00
Sherlock Huang	c368c0faf0	Fix meta for aten.fill, constant_pad_nd, _adaptive_avg_pool2d (#88069 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88069 Approved by: https://github.com/ngimel, https://github.com/malfet	2022-11-01 15:36:06 +00:00
Sherlock Huang	c7ac333430	Fix args for meta__fused_moving_avg_obs_fq_helper (#88058 ) Fixes https://github.com/pytorch/torchdynamo/issues/1802 There are a few problems: 1. torch.fused_moving_avg_obs_fake_quant doesn't have an OpInfo test 2. self.empty_like() is not a valid call; it should be torch.empty_like(self) 3. the Python meta function has some unexplained behavior for arguments with a default value of bool type? In particular, problem 3 is the most concerning one. UPDATE: This is expected behavior, see discussion below for explanation. Without setting the default value for `per_row_fake_quant` and `symmetric_quant`, it gets the following error when running with a meta tensor. ``` meta__fused_moving_avg_obs_fq_helper() missing 2 required positional arguments: 'per_row_fake_quant' and 'symmetric_quant' ``` I can fix this by adding the default values to these two args. However, I observed something strange when examining the actual values in the meta function. ``` print("per_row_fake_quant", per_row_fake_quant) print("symmetric_quant", symmetric_quant) ``` When the default values are False, the printed value correctly reflects the arg value populated from the call site. When the default values are True, the printed value is ALWAYS True, regardless of the value populated from the call site. When the default values are None, the printed value is `None` when the call site sets the value to 'False', and 'True' when the call site sets the value to 'True'. I also verified that this bug affects other meta functions with default args.... My speculation is that this is something about pybind value packing when called from the C++ dispatcher to the Python meta function, and default value parsing for the Python meta function (and other Python dispatch functions)? I tried to find the C++ call stack, but gdb is missing symbols and the C++ stacktrace is not working properly... Appreciate anyone who can point me to the source file for pybind value packing. cc @ezyang cc @bdhirsh. I know you had a fix in the symbolic shape branch... cc @yanboliang who reported this bug Pull Request resolved: https://github.com/pytorch/pytorch/pull/88058 Approved by: https://github.com/bdhirsh, https://github.com/yanboliang	2022-10-31 19:00:16 +00:00
Edward Z. Yang	ff94494644	Revert "Revert "Unify meta tensor and fake tensor converter conversion (#87943 )"" (#88045 ) This reverts commit `bc64999b83`. Check torch/_subclasses/meta_utils.py for "This is very tricky" for the bugfix explanation. cc @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx Pull Request resolved: https://github.com/pytorch/pytorch/pull/88045 Approved by: https://github.com/kit1980, https://github.com/Chillee	2022-10-31 17:50:14 +00:00
Sherlock Huang	0a4ca9d083	Fix meta for aten.angle and aten.index_copy (#88066 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88066 Approved by: https://github.com/albanD	2022-10-31 17:11:29 +00:00
Sherlock Huang	5723fd503c	Fix meta function for aten.flip and aten.rot90 (#88065 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88065 Approved by: https://github.com/mruberry	2022-10-31 16:52:05 +00:00
PyTorch MergeBot	bc64999b83	Revert "Unify meta tensor and fake tensor converter conversion (#87943 )" This reverts commit `baa715e790`. Reverted https://github.com/pytorch/pytorch/pull/87943 on behalf of https://github.com/kit1980 due to Broke several inductor tests	2022-10-29 18:39:28 +00:00

275 Commits