pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Kurt Mohler	3908ebca86	Test COW materialization in backward ops (#123593 ) Part of #97856 Pull Request resolved: https://github.com/pytorch/pytorch/pull/123593 Approved by: https://github.com/ezyang	2024-04-09 22:31:50 +00:00
Pearu Peterson	d895192e87	Fix zeros_like on sparse compressed fake tensors (#123084 ) Fixes https://github.com/pytorch/pytorch/pull/117907#issuecomment-2025769663 Adds block compressed sparse tensors support to zeros_like Pull Request resolved: https://github.com/pytorch/pytorch/pull/123084 Approved by: https://github.com/amjames, https://github.com/peterbell10	2024-04-03 16:11:11 +00:00
Yakai Wang	4d5cdc2e1e	Fix empty_like bug for sparse tensors. (#121900 ) Fixes #121671 Pull Request resolved: https://github.com/pytorch/pytorch/pull/121900 Approved by: https://github.com/pearu	2024-04-01 22:40:38 +00:00
Kurt Mohler	ca9606f809	Update COW OpInfo test to include kwargs and expected materialization (#122437 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/122437 Approved by: https://github.com/ezyang	2024-03-24 06:07:30 +00:00
PyTorch MergeBot	c80601f35a	Revert "Avoid COW materialize in conv, log sigmoid, repeat, group_norm, batch_norm (#121537 )" This reverts commit `a2a88f39ee`. Reverted https://github.com/pytorch/pytorch/pull/121537 on behalf of https://github.com/kurtamohler due to flaky CI failures ([comment](https://github.com/pytorch/pytorch/pull/121537#issuecomment-2010937226))	2024-03-21 00:03:30 +00:00
Kurt Mohler	a2a88f39ee	Avoid COW materialize in conv, log sigmoid, repeat, group_norm, batch_norm (#121537 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/121537 Approved by: https://github.com/ezyang	2024-03-19 06:15:00 +00:00
lezcano	8a5a377190	Move doc links to point to main (#121823 ) The previous links were pointing to an outdated branch Command: `find . -type f -exec sed -i "s:docs/main:docs/master:g" {} + ` Pull Request resolved: https://github.com/pytorch/pytorch/pull/121823 Approved by: https://github.com/albanD, https://github.com/malfet	2024-03-15 19:49:37 +00:00
blorange-amd	b27d76949b	[ROCm] Enable several fake_crossref UTs on ROCm (#121112 ) Enabled unit tests: test_ops::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_norm_subgradients_at_zero_cuda_float32 test_ops::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_norm_subgradients_at_zero_cuda_float32 test_ops::TestFakeTensorCUDA::test_fake_crossref_backward_amp_norm_nuc_cuda_float32 test_ops::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_norm_nuc_cuda_float32 test_ops::TestFakeTensorCUDA::test_fake_crossref_backward_amp_svd_cuda_float32 test_ops::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_svd_cuda_float32 Pull Request resolved: https://github.com/pytorch/pytorch/pull/121112 Approved by: https://github.com/ezyang	2024-03-06 17:36:47 +00:00
Kurt Mohler	77aea289ae	Add test to check that COW inputs are not materialized (#119507 ) Part of #97856 Pull Request resolved: https://github.com/pytorch/pytorch/pull/119507 Approved by: https://github.com/ezyang ghstack dependencies: #120455	2024-03-01 05:05:28 +00:00
PyTorch MergeBot	dbe0967a0a	Revert "Add test to check that COW inputs are not materialized (#119507 )" This reverts commit `2ebf2c88ba`. Reverted https://github.com/pytorch/pytorch/pull/119507 on behalf of https://github.com/izaitsevfb due to breaks xla jobs ([comment](https://github.com/pytorch/pytorch/pull/119507#issuecomment-1970022840))	2024-02-28 22:26:59 +00:00
Kurt Mohler	2ebf2c88ba	Add test to check that COW inputs are not materialized (#119507 ) Part of #97856 Pull Request resolved: https://github.com/pytorch/pytorch/pull/119507 Approved by: https://github.com/ezyang ghstack dependencies: #120455	2024-02-28 00:37:33 +00:00
lezcano	b97fa6ac30	Make roll a decomposition and remove its lowering (#119857 ) We use the fact that we now propagate indexing properly to avoid having to maintain two different implementations of the op. Doing this we also remove a spurious guard on this op. We move the ref into a decomp as we now use advanced indexing. The only difference we did in the implementation is that we now use advanced indexing rather than `torch.cat`. We also remove it from core. Let's see how this goes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/119857 Approved by: https://github.com/peterbell10, https://github.com/larryliu0820 ghstack dependencies: #119863, #119864	2024-02-16 19:14:39 +00:00
Andrew M. James	4625ecb858	Add decomp for linalg.cross (#119809 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119809 Approved by: https://github.com/lezcano, https://github.com/peterbell10	2024-02-16 09:58:38 +00:00
blorange-amd	df9b44436a	[ROCm] Enable float16/complex32 fft tests on ROCm (#117296 ) This PR is to enable float16/complex32 fft tests on ROCm. Sample results are attached here: [test_spectral_ops_results.log](https://github.com/pytorch/pytorch/files/13908533/test_spectral_ops_results.log) test_decomp::TestDecompCUDA::test_comprehensive_fft* test_decomp::TestDecompCUDA::test_quick_fft* test_jit_fuser_te::TestNNCOpInfoCUDA::test_nnc_correctness_fft* test_meta::TestMetaCUDA::test_dispatch_meta_inplace_fft* test_meta::TestMetaCUDA::test_dispatch_meta_outplace_fft* test_meta::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft* test_meta::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft* test_meta::TestMetaCUDA::test_meta_inplace_fft* test_meta::TestMetaCUDA::test_meta_outplace_fft* test_ops::TestCommonCUDA::test_complex_half_reference_testing_fft* test_ops::TestCommonCUDA::test_python_ref__refs_fft* test_ops::TestCommonCUDA::test_python_ref_executor__refs_fft* test_ops::TestCommonCUDA::test_python_ref_meta__refs* test_ops::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft* test_schema_check::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft* test_spectral_ops::TestFFTCUDA::test_empty_fft__refs_fft* test_spectral_ops::TestFFTCUDA::test_empty_fft_fft* test_spectral_ops::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft* test_spectral_ops::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft* test_spectral_ops::TestFFTCUDA::test_fft_round_trip_cuda* test_spectral_ops::TestFFTCUDA::test_fft_type_promotion_cuda* test_spectral_ops::TestFFTCUDA::test_fftn_round_trip_cuda* test_spectral_ops::TestFFTCUDA::test_hfftn_cuda_float16 test_spectral_ops::TestFFTCUDA::test_ihfftn_cuda_float16 test_utils::TestDeviceUtilsCUDA::test_device_mode_ops_fft Pull Request resolved: https://github.com/pytorch/pytorch/pull/117296 Approved by: https://github.com/pruthvistony, https://github.com/malfet	2024-02-13 22:35:32 +00:00
Pearu Peterson	2c91e13afc	Add lowerings to special functions (#119187 ) As in the title. In addition, the PR introduces infrastructure for lowerings of pointwise functions that have both cpp and triton implementations available. Pull Request resolved: https://github.com/pytorch/pytorch/pull/119187 Approved by: https://github.com/peterbell10	2024-02-11 16:35:40 +00:00
Edward Z. Yang	9bce208dfb	Replace follow_imports = silent with normal (#118414 )
This is a lot of files changed! Don't panic! Here's how it works:
* Previously, we set `follow_imports = silent` for our mypy.ini configuration. Per https://mypy.readthedocs.io/en/stable/running_mypy.html#follow-imports, what this does is whenever we have an import to a module which is not listed as a file to be typechecked in mypy, we typecheck it as normal but suppress all errors that occurred in that file.
* When mypy is run inside lintrunner, the list of files is precisely the files covered by the glob in lintrunner.toml, but with files in excludes excluded.
* The top-level directive `# mypy: ignore-errors` instructs mypy to typecheck the file as normal, but ignore all errors.
* Therefore, it should be equivalent to set `follow_imports = normal`, if we put `# mypy: ignore-errors` on all files that were previously excluded from the file list.
* Having done this, we can remove the exclude list from .lintrunner.toml, since excluding a file from typechecking is baked into the files themselves.
* torch/_dynamo and torch/_inductor were previously in the exclude list, because they were covered by MYPYINDUCTOR. It is not OK to mark these as `# mypy: ignore-errors` as this will impede typechecking on the alternate configuration. So they are temporarily being checked twice, but I am suppressing the errors in these files as the configurations are not quite the same. I plan to unify the configurations so this is only a temporary state.
* There were some straggler type errors after these changes somehow, so I fixed them as needed. There weren't that many.
In the future, to start type checking a file, just remove the ignore-errors directive from the top of the file.
The codemod was done with this script authored by GPT-4:
```
import glob

exclude_patterns = [
    ...
]

for pattern in exclude_patterns:
    for filepath in glob.glob(pattern, recursive=True):
        if filepath.endswith('.py'):
            with open(filepath, 'r+') as f:
                content = f.read()
                f.seek(0, 0)
                f.write('# mypy: ignore-errors\n\n' + content)
```
Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/118414 Approved by: https://github.com/thiagocrepaldi, https://github.com/albanD	2024-01-27 02:44:11 +00:00
Khushi Agrawal	5d2d21a7be	[bfloat16][easy] kthvalue, median (#117279 ) Fixes #109991 Pull Request resolved: https://github.com/pytorch/pytorch/pull/117279 Approved by: https://github.com/Skylion007	2024-01-11 22:44:07 +00:00
Pearu Peterson	4a37f57c69	Add batched sparse CSR/CSC/BSR/BSC to sparse COO conversion support (#116206 ) As in the title. Fixes https://github.com/pytorch/pytorch/issues/104868 Pull Request resolved: https://github.com/pytorch/pytorch/pull/116206 Approved by: https://github.com/amjames, https://github.com/lezcano, https://github.com/cpuhrsch	2024-01-07 19:42:02 +00:00
Aaron Gokaslan	a7902571be	Add bfloat16 CUDA support to gamma unary functions (#116929 ) Add bfloat16 support to unary gamma functions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116929 Approved by: https://github.com/malfet	2024-01-07 02:07:55 +00:00
Aaron Gokaslan	bd10fea79a	[BE]: Enable F821 and fix bugs (#116579 ) Fixes #112371 I tried to fix as many of the bugs as I could, a few I could not figure out what the proper fix for them was though and so I left them with noqas. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116579 Approved by: https://github.com/ezyang	2024-01-01 08:40:46 +00:00
wgb	71ec3edbf7	Enhance Opinfo to support privateuse1 (#116417 ) Fix Opinfo does not support third-party devices when the current test framework instantiation method is privateuse1. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116417 Approved by: https://github.com/albanD	2023-12-29 13:43:29 +00:00
Isuru Fernando	1f1ff629a8	Use parent class attribute supports_out for foreach_zero opinfo (#112778 ) Instead of introducing a new has_no_out_of_place attribute Also fixes foreach_copy tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/112778 Approved by: https://github.com/lezcano	2023-11-22 18:00:44 +00:00
Joel Schlosser	afdc528520	Print the index and summary of the SampleInput that failed an OpInfo test (#99444 )
Related to the Reproducible Testing BE project. Goal is to print out the sample input that failed an OpInfo test.
Crazy idea: to avoid requiring widespread changes across tests that use OpInfo sample inputs, return a new special iterator type from `OpInfo.sample_inputs()`, etc. that tracks the most recent item seen. If a test fails later on, print out this info to identify the sample that failed the test. This solves the problem that the test framework currently has no concept of which sample input is being operated on.
This PR contains the following changes:
* New `TrackedInputIter` that wraps a sample inputs func iterator and tracks the most recent input seen in a `TrackedInput` structure
* The information is stored in a dictionary on the test function itself, mapping `full test ID -> most recent TrackedInput`
* To determine the test function that is being run, we do some stack crawling hackery in `extract_test_fn_and_id()`
* Above applies only when one of the following is called: `OpInfo.sample_inputs()`, `OpInfo.error_inputs()`, `OpInfo.reference_inputs()`, and `OpInfo.conjugate_sample_inputs()`. This could easily be extended to `ModuleInfo`s and the sparse sample input funcs as well
Example output when a sample input causes a failure:
```
======================================================================
ERROR: test_foo_add_cpu_uint8 (__main__.TestFakeTensorCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 911, in test_wrapper
    return test(*args, **kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 1097, in only_fn
    return fn(slf, *args, **kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/test/test_ops.py", line 2211, in test_foo
    self.fail('Example failure')
AssertionError: Example failure

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_utils.py", line 2436, in wrapper
    method(*args, **kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 414, in instantiated_test
    result = test(self, **param_kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 917, in test_wrapper
    raise Exception(
Exception: Caused by sample input at index 2: SampleInput(input=Tensor[size=(5, 1), device="cpu", dtype=torch.uint8], args=TensorList[Tensor[size=(5,), device="cpu", dtype=torch.uint8]], kwargs={}, broadcasts_input=True, name='')

To execute this test, run the following from the base repo dir:
    python test/test_ops.py -k test_foo_add_cpu_uint8

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

----------------------------------------------------------------------
```
This notably doesn't print the actual `SampleInput` values, as that's hard without fully reproducible random sample generation. I went down this path for a while and it seems infeasible without adding an untenable amount of overhead to set the random seed per SampleInput (see https://github.com/pytorch/pytorch/issues/86694#issuecomment-1614943708 for more details). For now, I am settling for at least spitting out the index and some metadata of the `SampleInput`, as it seems better than nothing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99444 Approved by: https://github.com/janeyx99	2023-11-21 23:08:35 +00:00
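The tracking-iterator idea described in this commit can be sketched in a few lines. This is only an illustration under stated assumptions, not the code from the PR: `TrackingIter` and `run_over_samples` are hypothetical names, and the real `TrackedInputIter` additionally stores its state on the test function and locates that function by stack inspection.

```python
class TrackingIter:
    """Hypothetical wrapper: iterate normally, but remember the last item handed out."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.index = -1   # index of the most recent item, -1 before the first one
        self.last = None  # the most recent item itself

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self._it)  # StopIteration propagates without touching the state
        self.index += 1
        self.last = item
        return item


def run_over_samples(check, samples):
    """Run check() on every sample; on failure, report which sample was active."""
    tracked = TrackingIter(samples)
    try:
        for sample in tracked:
            check(sample)
    except Exception as err:
        raise RuntimeError(
            f"Caused by sample input at index {tracked.index}: {tracked.last!r}"
        ) from err
```

Because the wrapper is itself an iterator, callers that loop over the samples need no changes, which is the property the commit message relies on.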
PyTorch MergeBot	5f0d72124e	Revert "Print the index and summary of the SampleInput that failed an OpInfo test (#99444 )" This reverts commit `e7f12b1eb0`. Reverted https://github.com/pytorch/pytorch/pull/99444 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to cause memory leak on CUDA job `e7f12b1eb0` ([comment](https://github.com/pytorch/pytorch/pull/99444#issuecomment-1820491298))	2023-11-21 08:58:54 +00:00
Joel Schlosser	e7f12b1eb0	Print the index and summary of the SampleInput that failed an OpInfo test (#99444 )
Related to the Reproducible Testing BE project. Goal is to print out the sample input that failed an OpInfo test.
Crazy idea: to avoid requiring widespread changes across tests that use OpInfo sample inputs, return a new special iterator type from `OpInfo.sample_inputs()`, etc. that tracks the most recent item seen. If a test fails later on, print out this info to identify the sample that failed the test. This solves the problem that the test framework currently has no concept of which sample input is being operated on.
This PR contains the following changes:
* New `TrackedInputIter` that wraps a sample inputs func iterator and tracks the most recent input seen in a `TrackedInput` structure
* The information is stored in a dictionary on the test function itself, mapping `full test ID -> most recent TrackedInput`
* To determine the test function that is being run, we do some stack crawling hackery in `extract_test_fn_and_id()`
* Above applies only when one of the following is called: `OpInfo.sample_inputs()`, `OpInfo.error_inputs()`, `OpInfo.reference_inputs()`, and `OpInfo.conjugate_sample_inputs()`. This could easily be extended to `ModuleInfo`s and the sparse sample input funcs as well
Example output when a sample input causes a failure:
```
======================================================================
ERROR: test_foo_add_cpu_uint8 (__main__.TestFakeTensorCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 911, in test_wrapper
    return test(*args, **kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 1097, in only_fn
    return fn(slf, *args, **kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/test/test_ops.py", line 2211, in test_foo
    self.fail('Example failure')
AssertionError: Example failure

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_utils.py", line 2436, in wrapper
    method(*args, **kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 414, in instantiated_test
    result = test(self, **param_kwargs)
  File "/home/jbschlosser/branches/reproducible_testing/torch/testing/_internal/common_device_type.py", line 917, in test_wrapper
    raise Exception(
Exception: Caused by sample input at index 2: SampleInput(input=Tensor[size=(5, 1), device="cpu", dtype=torch.uint8], args=TensorList[Tensor[size=(5,), device="cpu", dtype=torch.uint8]], kwargs={}, broadcasts_input=True, name='')

To execute this test, run the following from the base repo dir:
    python test/test_ops.py -k test_foo_add_cpu_uint8

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

----------------------------------------------------------------------
```
This notably doesn't print the actual `SampleInput` values, as that's hard without fully reproducible random sample generation. I went down this path for a while and it seems infeasible without adding an untenable amount of overhead to set the random seed per SampleInput (see https://github.com/pytorch/pytorch/issues/86694#issuecomment-1614943708 for more details). For now, I am settling for at least spitting out the index and some metadata of the `SampleInput`, as it seems better than nothing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99444 Approved by: https://github.com/janeyx99	2023-11-21 00:11:20 +00:00
CaoE	455241bbd3	Add Half for atan2, logaddexp, logaddexp2, hypot, and nextafter on CPU (#112138 ) Add Half for atan2, logaddexp, logaddexp2, hypot, and nextafter on CPU. Pull Request resolved: https://github.com/pytorch/pytorch/pull/112138 Approved by: https://github.com/cpuhrsch	2023-11-06 06:01:29 +00:00
CaoE	26b5e27ace	Add Half support for cummax, cummin, cumprod, logcumsumexp, and prod on CPU (#112132 ) Add Half support for cummax, cummin, cumprod, logcumsumexp, and prod on CPU. Pull Request resolved: https://github.com/pytorch/pytorch/pull/112132 Approved by: https://github.com/cpuhrsch	2023-11-05 12:31:38 +00:00
CaoE	a310cc8968	Add Half support for kthvalue, cross, hist, and logit on CPU (#112135 ) Add Half support for kthvalue, cross, hist, and logit on CPU. Pull Request resolved: https://github.com/pytorch/pytorch/pull/112135 Approved by: https://github.com/cpuhrsch	2023-10-31 09:12:47 +00:00
Nikita Shulga	328a4c5475	[BE] Enhance `OpInfo.supported_dtype` (#111995 ) The current implementation is prone to errors, as it accepts any object but does not report an error if device_type is not recognized. Remediate it by accepting both device types and device identifiers (either a `torch.device` instance or a "{device_type}:{ordinal}" string). Fixes https://github.com/pytorch/pytorch/issues/111179 Pull Request resolved: https://github.com/pytorch/pytorch/pull/111995 Approved by: https://github.com/albanD	2023-10-27 19:42:01 +00:00
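As a rough illustration of the normalization this commit describes, the sketch below accepts a bare device type, a "{device_type}:{ordinal}" string, or any object with a `.type` attribute (as `torch.device` has), and fails loudly on anything unrecognized. The function name `normalize_device_type` and the `KNOWN_DEVICE_TYPES` set are assumptions made for the example, not part of the OpInfo API.

```python
# Illustrative subset only; the set of device types a real build knows is larger.
KNOWN_DEVICE_TYPES = {"cpu", "cuda", "mps", "xla"}


def normalize_device_type(device):
    """Hypothetical helper: reduce a device identifier to its device type."""
    if isinstance(device, str):
        # "cuda:1" -> "cuda"; a bare "cpu" passes through unchanged
        device_type = device.split(":", 1)[0]
    else:
        # Objects such as torch.device expose the type as a .type attribute
        device_type = getattr(device, "type", None)
    if device_type not in KNOWN_DEVICE_TYPES:
        raise ValueError(f"unrecognized device: {device!r}")
    return device_type
```

Raising instead of silently returning a default is the point of the change: a misspelled device can no longer pass unnoticed.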
Cao E	1c89ea7f72	Add Half support for softmax and log_softmax on CPU (#103315 ) Add Half support for softmax and log_softmax on CPU. Note: This introduces a correctness issue with MPS https://github.com/pytorch/pytorch/issues/111416 and https://github.com/pytorch/pytorch/issues/111479. Pull Request resolved: https://github.com/pytorch/pytorch/pull/103315 Approved by: https://github.com/jgong5, https://github.com/mikaylagawarecki, https://github.com/malfet	2023-10-26 08:38:54 +00:00
Aaron Gokaslan	cb856b08b2	[BE]: Attach cause to some exceptions and enable RUFF TRY200 (#111496 ) Did some easy fixes from enabling TRY200. Most of these seem like oversights instead of intentional. The proper way to silence intentional errors is with `from None` to note that you thought about whether it should contain the cause and decided against it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111496 Approved by: https://github.com/malfet	2023-10-19 21:56:36 +00:00
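The two forms of exception chaining mentioned in this commit can be shown with a small standalone sketch. The functions `parse_port` and `lookup` are hypothetical examples, not code from the PR.

```python
def parse_port(text):
    """Attach the original error as the cause of the new one."""
    try:
        return int(text)
    except ValueError as err:
        # "from err" records the ValueError as __cause__ of the RuntimeError
        raise RuntimeError(f"invalid port: {text!r}") from err


def lookup(mapping, key):
    """Deliberately drop the original error from the report."""
    try:
        return mapping[key]
    except KeyError:
        # "from None" marks the omission as intentional and hides the KeyError
        raise LookupError(f"unknown key: {key!r}") from None
```

With `from err` the traceback shows both exceptions linked as cause and effect; with `from None` only the new exception is shown.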
CaoE	2a40b7efcb	Add Half support for addcmul, addcdiv, cumsum, and topk on CPU (#103319 ) Add Half support for addcmul, addcdiv, cumsum, and topk on CPU. Note: This PR will introduce the issue https://github.com/pytorch/pytorch/issues/111454. Pull Request resolved: https://github.com/pytorch/pytorch/pull/103319 Approved by: https://github.com/jgong5, https://github.com/cpuhrsch	2023-10-19 17:47:45 +00:00
Philip Meier	973c87b320	raise instead of skip in test/test_meta.py (#110939 ) Supersedes #109004. Pull Request resolved: https://github.com/pytorch/pytorch/pull/110939 Approved by: https://github.com/lezcano, https://github.com/kurtamohler	2023-10-17 10:17:43 +00:00
CaoE	9399e0b1ff	add fp16 support for gemm (#99498 )
### Testing
Native matmul vs. mkldnn matmul on SPR (with avx512_fp16 support)

single core:

| Input | Naïve impl / ms | oneDNN / ms | Speed up |
| -- | -- | -- | -- |
| M: 128, N: 128, K: 128, trans_a: False, trans_b: False | 2010.387 | 64.700 | 31.072 |
| M: 128, N: 256, K: 128, trans_a: False, trans_b: False | 4027.116 | 107.780 | 37.364 |
| M: 8192, N: 768, K: 768, trans_a: False, trans_b: False | 28685868.488 | 90663.008 | 316.401 |

56 cores:

| Input | Naïve impl / ms | oneDNN / ms | Speed up |
| -- | -- | -- | -- |
| M: 128, N: 128, K: 128, trans_a: False, trans_b: False | 5.091 | 0.24 | 211.30 |
| M: 128, N: 128, K: 128, trans_a: False, trans_b: True | 5.224 | 0.23 | 220.09 |
| M: 128, N: 256, K: 128, trans_a: False, trans_b: False | 10.006 | 0.30 | 330.31 |
| M: 8192, N: 768, K: 768, trans_a: False, trans_b: False | 29435.372 | 1.770 | 1662.80 |
| M: 8192, N: 768, K: 768, trans_a: False, trans_b: True | 31464.961 | 1.728 | 18204.76 |
| M: 8192, N: 768, K: 3072, trans_a: False, trans_b: False | 115035.849 | 7.990 | 14396.90 |
| M: 8192, N: 768, K: 3072, trans_a: False, trans_b: True | 122981.023 | 7.725 | 15918.34 |
| Batch: 768, M: 128, N: 64, K: 128 | 2032.523 | 0.705 | 2882.23 |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99498 Approved by: https://github.com/jgong5, https://github.com/malfet	2023-09-28 01:03:50 +00:00
Peter Bell	d796518485	[refs] Fix size check from #108360 (#109083 ) PR #108360 uses the same default `last_dim_size` formula from complex-to-real (C2R) transforms for complex-to-complex (C2C) and real-to-complex (R2C). However, this is not correct because for C2R the input is only half the size of the full tensor, which is not the case for C2C and R2C. This error is mostly benign since `last_dim_size` was only used for the `>= 1` condition which is almost always met anyway. For this PR I now use it as the argument to `_apply_norm` which makes it load-bearing for correctness and so is thoroughly tested now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109083 Approved by: https://github.com/lezcano	2023-09-27 23:59:29 +00:00
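The size relationship behind this fix can be written out as a small sketch. The helper names are hypothetical and this is not the code from the refs. It assumes the usual convention that a one-sided (half) spectrum of a length-n real signal has n // 2 + 1 entries, and that an inverse real transform with no explicit size defaults to an even output length of 2 * (m - 1) for an input of length m.

```python
def c2r_default_last_dim_size(input_last_dim):
    """C2R: the input is a one-sided spectrum, so the full length is about twice as large."""
    return 2 * (input_last_dim - 1)


def c2c_or_r2c_default_last_dim_size(input_last_dim):
    """C2C and R2C: the input already has the full signal length."""
    return input_last_dim


def onesided_length(signal_length):
    """Number of entries in the one-sided spectrum of a real signal."""
    return signal_length // 2 + 1
```

Applying the C2R formula to a C2C or R2C input would roughly double the assumed signal length, which matters once the value feeds a normalization factor rather than only a `>= 1` check.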
Jane Xu	0a60219fe3	[foreach] Fix 0-size handling for real for real (#109402 )
@crcrpar's last attempt to fix the 0-size problem unfortunately did not pass all cases. See my comment in https://github.com/pytorch/pytorch/issues/100701. When we have a tail tensor of size 0, the old code would mess with the chunk logic to check the previous tensor's length. This is flawed because:
1. if the previous tensor was also 0 sized, (so a tensor list of [tensor, tensor, tensor, ..., 0-sized tensor, 0-sized tensor],) chunks would still be 0 and the nested for loop would be missed.
2. the nested forloop pronounces side effects on tensorListMeta that _shouldn't_ be there! This can mess up the compute in unexpected ways that I haven't really needed to reason through.
We noticed that the problem had not been fixed due to an internal report. This PR solves the issue by:
- removing the finagling of chunks when the tail tensor is 0-sized
- adding a surefire way for the kernel to be launched in the case where the last tensor is 0-sized AND there's content in the metadata, signifying there is stuff to compute still.
## test plan
As I went through the code, I also added some comments explaining what's up and modified our tensor inputs to ensure that this case is tested in the test_parity test in test_foreach.py. Yes, I do realize there is quite a bit of duplication and that this file could be due for a refactor. That said, the primary goal of this PR is to fix the pretty egregious bug and refactoring can be a followup.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109402 Approved by: https://github.com/albanD	2023-09-26 17:38:20 +00:00
jjsjann123	0d3db1048a	remove nvfuser test in upstream pytorch (#109918 ) Removing nvfuser related tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/109918 Approved by: https://github.com/msaroufim	2023-09-24 13:49:37 +00:00
Aaron Gokaslan	6d725e7d66	[BE]: enable ruff rules PLR1722 and PLW3301 (#109461 ) Enables two ruff rules derived from pylint: * PLR1722 replaces any exit() calls with sys.exit(). exit() is only designed to be used in REPL contexts, as it may not always be imported by default. This always uses the version in the sys module, which is better * PLW3301 replaces nested min / max calls with simplified versions (i.e. `min(a, min(b, c))` => `min(a, b, c)`). The new version is more idiomatic and more efficient. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109461 Approved by: https://github.com/ezyang	2023-09-18 02:07:21 +00:00
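A short sketch of what the two rules ask for, using hypothetical functions `smallest` and `main` that do not come from the PR:

```python
import sys


def smallest(a, b, c):
    # PLW3301: written as one flat call instead of min(a, min(b, c))
    return min(a, b, c)


def main(argv):
    if len(argv) < 2:
        # PLR1722: sys.exit() rather than the REPL helper exit()
        sys.exit(2)
    return smallest(*(int(x) for x in argv[1:4]))
```

Both rewrites preserve behavior: the flat `min` returns the same value as the nested form, and `sys.exit(2)` raises `SystemExit` with code 2 just as `exit(2)` does where that helper is available.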
Masaki Kozuki	602413a0a0	Refactor `test_foreach.py` (#107869 )
## Summary
- Change the default of `supports_autograd` and `supports_forward_ad` of `ForeachFuncInfo` to `True`
- Add `test_zero_size_tensor_inputs` to make sure that foreach functions can handle 0-size Tensor inputs
- Add `test_parity` to check the consistency between outputs of foreach and for-loop of native function.
- Add `test_autodiff` to check forward-mode and reverse-mode AD
- Keep the corner cases that are not covered by the newly introduced methods

rel:
- #58833

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107869 Approved by: https://github.com/janeyx99	2023-09-14 19:39:26 +00:00
Kiarash Jamali	fb288aa99b	Add Bfloat16 support to CrossKernel.cu (#108941 ) Fixes #108940 Pull Request resolved: https://github.com/pytorch/pytorch/pull/108941 Approved by: https://github.com/mikaylagawarecki	2023-09-11 19:05:01 +00:00
ekamiti	0f88d93b10	decomposition spectral ops fixes (#108360 ) Fixes https://github.com/pytorch/pytorch/issues/105986, https://github.com/pytorch/pytorch/issues/108204, https://github.com/pytorch/pytorch/issues/108205 Fix all issues flagged when making changes for https://github.com/pytorch/pytorch/pull/107421 Pull Request resolved: https://github.com/pytorch/pytorch/pull/108360 Approved by: https://github.com/ezyang	2023-09-09 04:48:09 +00:00
ekamiti	0ef2556351	Update sparse_funcs to include primtorch types (#107421 ) Fixes #107335. A few issues have been identified while enabling this test and filed: https://github.com/pytorch/pytorch/issues/105986 https://github.com/pytorch/pytorch/issues/108204 https://github.com/pytorch/pytorch/issues/108205 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107421 Approved by: https://github.com/ezyang	2023-09-05 14:34:48 +00:00
lezcano	239fed7e1e	Add reference for linalg.vecdot (#108188 ) Was addressing https://github.com/pytorch/pytorch/issues/108127, but then I realised that vecdot is already CompositeImplicit. Pushing anyway as a short-and-sweet PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108188 Approved by: https://github.com/peterbell10	2023-08-31 15:30:23 +00:00
Ken Jin	7349e8c1a1	Don't use `np.random` for TorchDynamo (#108009 ) Part of https://github.com/pytorch/pytorch/issues/107970 Pull Request resolved: https://github.com/pytorch/pytorch/pull/108009 Approved by: https://github.com/lezcano	2023-08-28 17:18:40 +00:00
ekamiti	4a022e2185	Update unary_ufuncs groupings to include primtorch types. (#107345 ) Fixes #107335. The skips were updated for the _ref ops to match those for eager mode where necessary. Part of breakdown of https://github.com/pytorch/pytorch/pull/104489. Pull Request resolved: https://github.com/pytorch/pytorch/pull/107345 Approved by: https://github.com/ezyang	2023-08-23 22:45:19 +00:00
ekamiti	017499b078	Update reduction_ops groupings to include primtorch types (#107338 ) Fixes https://github.com/pytorch/pytorch/issues/107335. The skips were updated for the _ref ops to match those for eager mode where necessary. Part of breakdown of https://github.com/pytorch/pytorch/pull/104489. Pull Request resolved: https://github.com/pytorch/pytorch/pull/107338 Approved by: https://github.com/ezyang	2023-08-19 02:09:11 +00:00
Masaki Kozuki	b234b94760	Add in-place `_foreach_copy` (#107226 ) Fixes #107162 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107226 Approved by: https://github.com/janeyx99	2023-08-17 00:11:18 +00:00
Ivan Yashchuk	c913f3857f	Remove dynamo+nvfuser (#105789 ) This PR removes unmaintained Dynamo+nvFuser. Pull Request resolved: https://github.com/pytorch/pytorch/pull/105789 Approved by: https://github.com/jansel, https://github.com/jjsjann123, https://github.com/albanD	2023-08-08 22:29:32 +00:00
PyTorch MergeBot	891bb259f8	Revert "Remove dynamo+nvfuser (#105789 )" This reverts commit `6030151d37`. Reverted https://github.com/pytorch/pytorch/pull/105789 on behalf of https://github.com/DanilBaibak due to Break a lot of tests on main. ([comment](https://github.com/pytorch/pytorch/pull/105789#issuecomment-1669710571))	2023-08-08 14:20:32 +00:00
Ivan Yashchuk	6030151d37	Remove dynamo+nvfuser (#105789 ) This PR removes unmaintained Dynamo+nvFuser. Pull Request resolved: https://github.com/pytorch/pytorch/pull/105789 Approved by: https://github.com/jansel, https://github.com/jjsjann123, https://github.com/albanD	2023-08-08 13:29:31 +00:00


175 Commits