pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Yuanyuan Chen	fdab48a7c1	Enable all PIE rules on ruff (#165814 ) This PR enables all PIE rules on ruff, there are already some enabled rules from this family, the new added rules are ``` PIE796 Enum contains duplicate value: {value} PIE808 Unnecessary start argument in range ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/165814 Approved by: https://github.com/ezyang	2025-10-18 07:36:18 +00:00
PyTorch MergeBot	24520b8386	Revert "Enable all PIE rules on ruff (#165814 )" This reverts commit `c79dfdc655`. Reverted https://github.com/pytorch/pytorch/pull/165814 on behalf of https://github.com/cyyever due to Need to cover more files ([comment](https://github.com/pytorch/pytorch/pull/165814#issuecomment-3417931863))	2025-10-18 07:21:08 +00:00
Yuanyuan Chen	c79dfdc655	Enable all PIE rules on ruff (#165814 ) This PR enables all PIE rules on ruff, there are already some enabled rules from this family, the new added rules are ``` PIE796 Enum contains duplicate value: {value} PIE808 Unnecessary start argument in range ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/165814 Approved by: https://github.com/ezyang	2025-10-18 06:40:12 +00:00
vishalgoyal316	9c12651417	Improve error message for non-positive groups in convolution (#165669 ) Prevents from segmentation fault for invalid groups value in convolution. Fixes #142835 Pull Request resolved: https://github.com/pytorch/pytorch/pull/165669 Approved by: https://github.com/mikaylagawarecki	2025-10-17 19:06:05 +00:00
Yuanyuan Chen	8de85896e0	Enable ruff rule E721 (#165162 ) `E721` checks for object type comparisons using == and other comparison operators. This is useful because it is recommended to use `is` for type comparisons. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165162 Approved by: https://github.com/Skylion007	2025-10-13 01:48:55 +00:00
PyTorch MergeBot	816fb7f48d	Revert "Enable ruff rule E721 (#165162 )" This reverts commit `9e7c19f72b`. Reverted https://github.com/pytorch/pytorch/pull/165162 on behalf of https://github.com/pytorch-auto-revert due to Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable ([comment](https://github.com/pytorch/pytorch/pull/165162#issuecomment-3393328271))	2025-10-11 13:25:40 +00:00
Yuanyuan Chen	9e7c19f72b	Enable ruff rule E721 (#165162 ) `E721` checks for object type comparisons using == and other comparison operators. This is useful because it is recommended to use `is` for type comparisons. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165162 Approved by: https://github.com/Skylion007	2025-10-11 06:43:53 +00:00
Jeff Daily	4a6abba0d9	[ROCm][CI] test_convolution.py uses miopen immediate mode (#164598 ) This should help stabilize some flaky test behavior where miopen would pick different solutions for different parts of the same test and the test expects bitwise identical results. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164598 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-10-06 17:48:50 +00:00
Jeff Daily	6b7970192f	[ROCm][CI] fix test_cudnn_convolution_relu_cuda (#164466 ) Fixes #162816. Test was comparing output of conv vs fused conv but inputs were different memory formats. Also fix test_cudnn_convolution_add_relu. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164466 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-10-02 20:36:54 +00:00
Nikita Shulga	95be302889	Skip test_conv3d_cudnn_broken on ROCM (#164138 ) Followup after https://github.com/pytorch/pytorch/pull/163903 Fixes https://github.com/pytorch/pytorch/issues/164137 Pull Request resolved: https://github.com/pytorch/pytorch/pull/164138 Approved by: https://github.com/Camyll	2025-09-29 16:56:51 +00:00
Eddie Yan	e2817ac204	[cuDNN][Convolution] Disable cuDNN for 3D convolutions with kernel size != 1 for cuDNN 9.8+ (#163581 ) To workaround #163539 Still confirming whether 9.10 is affected. The original test states that the convolution is "large," but note that the input size does not appear to require 64-bit indexing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/163581 Approved by: https://github.com/ngimel, https://github.com/malfet Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>	2025-09-26 23:47:29 +00:00
eqy	0ea10f9912	[cuDNN][conv][64-bit] Disable cuDNN for 64-bit depthwise convs again (#163171 ) test is breaking, will check if there's an older version that we can enable on to avoid completely dropping support Pull Request resolved: https://github.com/pytorch/pytorch/pull/163171 Approved by: https://github.com/ngimel, https://github.com/malfet	2025-09-26 22:12:17 +00:00
mansiag05	d4e4f70768	Fix overflow in slow_conv3d when kernel size is too large. (#162718 ) Also, adding check for padding to avoid segmentation fault caused by overflow. Fixes #141846 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162718 Approved by: https://github.com/jgong5, https://github.com/Skylion007	2025-09-26 13:39:29 +00:00
Yuanyuan Chen	281bb56cc5	Enable half precision types on test_conv_cudnn_nhwc_support (#163444 ) This PR adds float16 and bfloat16 cases to `test_conv_cudnn_nhwc_support` and removes outdated comments. Pull Request resolved: https://github.com/pytorch/pytorch/pull/163444 Approved by: https://github.com/Skylion007	2025-09-22 04:11:20 +00:00
Jeff Daily	0def79fdd9	[ROCm] fix conv relu fusion (#162856 ) Fixes #162816. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162856 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-09-15 22:49:32 +00:00
Jeff Daily	d65ffdef3d	[ROCm] fix miopen batchnorm changing output format (#162112 ) It was found that the integration of miopen batchnorm was causing the output to always be in default contig memory format even when the input was channels last. This also unskips a number of related unit tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162112 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com> Co-authored-by: Dmitry Nikolaev <dmitry.nikolaev@amd.com> Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>	2025-09-11 19:37:48 +00:00
eqy	5dbee5691c	[cuDNN][Convolution][TF32][64bit] Add `tf32_on_and_off` decorator to conv3d 64bit test (#161004 ) cuDNN has new generated kernels that can use TF32. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161004 Approved by: https://github.com/janeyx99, https://github.com/Skylion007	2025-09-10 21:39:35 +00:00
Jeff Daily	99f356fa58	[ROCm] revamp miopen integration (#161687 ) Update sources under ATen/miopen and ATen/native/miopen to align with best practices. Avoid reshape_ calls inside backward operations. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161687 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-09-03 22:28:09 +00:00
Eddie Yan	f391afe9bf	[cuDNN][convolution] remove redundant conv3d 64bit test (#161177 ) turns out it's the same as ``` @onlyCUDA @largeTensorTest("40GB") @largeTensorTest("24GB", "cpu") @tf32_on_and_off(0.005) def test_conv3d_64bit_indexing(self, device): x = torch.rand(1, 32, 512, 512, 256) m = torch.nn.Conv3d(32, 1, kernel_size=1, padding=0, stride=1, bias=False) yref = m(x) y = m.to(device=device)(x.to(device=device)) self.assertEqual(yref, y) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/161177 Approved by: https://github.com/Skylion007	2025-08-25 15:01:05 +00:00
eqy	9903ca4f70	[cuDNN][64-bit indexing] update conv depthwise 64bit indexing dispatch condition to match native kernel (#156140 ) The native kernel doesn't support batch splitting so the previous check wasn't aggressive enough in dispatching to cuDNN https://github.com/pytorch/pytorch/issues/155225 Pull Request resolved: https://github.com/pytorch/pytorch/pull/156140 Approved by: https://github.com/ngimel, https://github.com/atalman	2025-08-12 18:07:41 +00:00
Nikita Shulga	e06b110f73	[Testing] Add MPS to NATIVE_DEVICES (#153835 ) This would allow me to enable more opinfo tests against MPS device eventually and supposed to be a very simple test, but actually required minor adjustments to lots of test files, namely: - Introduce `all_mps_types_and` that is very similar to `all_types_and`, but skips `float64` - Decorate lots of tests with `@dtypesIfMPS(*all_mps_types())` - Skip `test_from_dlpack_noncontinguous` as it currently crashes (need to be fixed) - Add lots of `expectedFailureIfMPS` - Delete all `@onlyNativeDeviceTypesAnd("mps")` <sarcasm> I love how well documented this variable are </sarcasm> Pull Request resolved: https://github.com/pytorch/pytorch/pull/153835 Approved by: https://github.com/Skylion007	2025-08-05 18:57:35 +00:00
eqy	c89fa88acb	[conv][cuDNN][64-bit indexing] reduce memory usage of depthwise conv 64-bit indexing test (#158981 ) Use half instead for reduced memory usage Pull Request resolved: https://github.com/pytorch/pytorch/pull/158981 Approved by: https://github.com/soulitzer, https://github.com/Skylion007	2025-07-25 23:58:45 +00:00
PyTorch MergeBot	317af4c87b	Revert "[cuDNN][64-bit indexing] update conv depthwise 64bit indexing dispatch condition to match native kernel (#156140 )" This reverts commit `a5f59cc2ea`. Reverted https://github.com/pytorch/pytorch/pull/156140 on behalf of https://github.com/atalman due to breaks internal builds ([comment](https://github.com/pytorch/pytorch/pull/156140#issuecomment-2988441548))	2025-06-19 15:09:29 +00:00
eqy	a5f59cc2ea	[cuDNN][64-bit indexing] update conv depthwise 64bit indexing dispatch condition to match native kernel (#156140 ) The native kernel doesn't support batch splitting so the previous check wasn't aggressive enough in dispatching to cuDNN https://github.com/pytorch/pytorch/issues/155225 Pull Request resolved: https://github.com/pytorch/pytorch/pull/156140 Approved by: https://github.com/ngimel	2025-06-18 17:32:36 +00:00
eqy	bd3c32916c	[cuDNN] Enabled dilation for deterministic convolutions in cuDNN (#154292 ) Provides order-of-magnitude speedup over fallback impl. https://github.com/pytorch/pytorch/issues/28777 Pull Request resolved: https://github.com/pytorch/pytorch/pull/154292 Approved by: https://github.com/Skylion007 Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>	2025-06-11 23:35:52 +00:00
Joona Havukainen	981bdb39ca	Enable ConvTranspose3D for FP32 and Complex64 (#154696 ) Fixes #154615 Enables using ConvTranspose3D since it seems support exists both on MacOS 14 and 15. For the half dtypes the discrepancy of CPU and GPU implementations is too large to conclude whether there is a bug in the implementation or not without a more rigorous study on what bounds are there to the expected error. So they are left unsupported for now and an assert is added to notify the user if the op is called with fp16 or bf16 inputs. Tests for ConvTranspose3D were enabled for the supported data types. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154696 Approved by: https://github.com/malfet	2025-06-02 16:24:03 +00:00
Aaron Gokaslan	dbad6d71c7	[BE][Ez]: Unskip conv1d MPS test (#154795 ) Fixes issue I noticed where conv1d test is skipped for complex types unconditionally Pull Request resolved: https://github.com/pytorch/pytorch/pull/154795 Approved by: https://github.com/jansel	2025-05-31 23:01:19 +00:00
eqy	823a35807c	[CUDA][CUDNN] Dispatch to cuDNN for non-batch-splittable 64-bit NCHW convolutions (#153101 ) For #152816 Pull Request resolved: https://github.com/pytorch/pytorch/pull/153101 Approved by: https://github.com/Skylion007	2025-05-20 20:19:03 +00:00
PyTorch MergeBot	bf0fe4f828	Revert "[CUDA][CUDNN] Dispatch to cuDNN for non-batch-splittable 64-bit NCHW convolutions (#153101 )" This reverts commit `ced90d23d3`. Reverted https://github.com/pytorch/pytorch/pull/153101 on behalf of https://github.com/jeanschmidt due to Seems to have introduced breakages on main, tentative revert: https://github.com/pytorch/pytorch/actions/runs/15024667248/job/42224521705 ([comment](https://github.com/pytorch/pytorch/pull/153101#issuecomment-2881208171))	2025-05-14 18:52:07 +00:00
eqy	ced90d23d3	[CUDA][CUDNN] Dispatch to cuDNN for non-batch-splittable 64-bit NCHW convolutions (#153101 ) For #152816 Pull Request resolved: https://github.com/pytorch/pytorch/pull/153101 Approved by: https://github.com/Skylion007	2025-05-14 15:22:47 +00:00
Eddie Yan	ec68d082a1	[CUDA][TF32] Account for TF32 in `test_conv2d_same_padding` (#152618 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152618 Approved by: https://github.com/msaroufim, https://github.com/Skylion007	2025-05-02 20:19:00 +00:00
Jagadish Krishnamoorthy	0d99b4e9e2	ROCm: Enable tf32 testing on test_nn (#148945 ) Add tf32 support for ROCm tests. test command: python test/test_nn.py -v Pull Request resolved: https://github.com/pytorch/pytorch/pull/148945 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-04-28 23:01:04 +00:00
Alvaro-Kothe	8ce3d4a541	test(Conv3d): use correct class for `test_Conv3d_module_same_padding` (#152187 ) The test for the class `Conv3d` is calling `Conv2d`. This PR just ensure that we are testing the correct module. Pull Request resolved: https://github.com/pytorch/pytorch/pull/152187 Approved by: https://github.com/Skylion007	2025-04-28 16:59:12 +00:00
cyy	970fefcc53	Remove outdated skipCUDAIfCudnnVersionLessThan decoration (#148940 ) Test conditions for CUDNN 7 and 8 were removed because we have moved to CUDNN 9. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148940 Approved by: https://github.com/mikaylagawarecki	2025-03-13 18:02:50 +00:00
cyy	a5f6b24d87	Remove outdated skipIfRocmVersionLessThan decorations (#148941 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/148941 Approved by: https://github.com/jeffdaily	2025-03-11 18:37:40 +00:00
Jeff Daily	44248c44eb	[ROCm] miopen benchmark behavior now better aligns with cudnn (#145294 ) The default benchmark setting is now false. The new miopen behavior means when benchmarking is disabled, for any shape that doesn't have a find hit, then it will do a quick search (same behavior as the prior default), and use that result. Now when benchmark is enabled, it will perform an exhaustive search and update any DBs. miopen immediate mode is still available and is used when deterministic is true and benchmark is false. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145294 Approved by: https://github.com/BrianHarrisonAMD, https://github.com/malfet	2025-02-05 17:19:53 +00:00
Benjamin Glass	5aa5a5763e	[inductor triton] Disable incorrect TF32 usage on CUDA capability < 8 (#145684 ) Triton 2.2 and greater have a bug where allowing TF32 generation for a GPU that does not support TF32 will cause code generation errors. Patch around this problem by: 1. Adding a function to `torch.cuda` that determines whether CUDA hardware is capable of using the TF32 format. 2. Using that function to explicitly disable TF32 generation when calling Triton, where needed. To demonstrate that this fix works, try running `test/inductor/test_max_autotune.py` on a GPU with CUDA compute capability < 8 (e.g. any NVIDIA consumer GPU) without this fix. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145684 Approved by: https://github.com/eqy	2025-01-28 22:01:08 +00:00
PyTorch MergeBot	6a4fb4b615	Revert "Align CPU behavior with CUDA for `ConvTranspose` when `out_channels=0` (#142859 )" This reverts commit `cb814c0b96`. Reverted https://github.com/pytorch/pytorch/pull/142859 on behalf of https://github.com/malfet due to It broke ROCM tests again, see `5cd2b34e82/1` ([comment](https://github.com/pytorch/pytorch/pull/142859#issuecomment-2614523822))	2025-01-26 17:49:05 +00:00
Wu, Chunyuan	cb814c0b96	Align CPU behavior with CUDA for `ConvTranspose` when `out_channels=0` (#142859 ) Fixes https://github.com/pytorch/pytorch/issues/142466. Remove the `weight.numel() != 0` check to align the behavior with CUDA for `ConvTranspose` when `out_channels=0`. After removing this check, the existing code is already able to give an empty output in such case. Test plan: ``` python -u test/nn/test_convolution.py -k test_ConvTranspose_output_channels_0_cpu_float32 python -u test/nn/test_convolution.py -k test_ConvTranspose_output_channels_0_cuda_float32 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/142859 Approved by: https://github.com/mingfeima, https://github.com/malfet	2025-01-26 01:56:40 +00:00
PyTorch MergeBot	d95a6babcc	Revert "Align CPU behavior with CUDA for `ConvTranspose` when `out_channels=0` (#142859 )" This reverts commit `0bff377880`. Reverted https://github.com/pytorch/pytorch/pull/142859 on behalf of https://github.com/huydhn due to Sorry for reverting your change but the XLA failures look legit ([comment](https://github.com/pytorch/pytorch/pull/142859#issuecomment-2608631019))	2025-01-23 01:10:31 +00:00
Wu, Chunyuan	0bff377880	Align CPU behavior with CUDA for `ConvTranspose` when `out_channels=0` (#142859 ) Fixes https://github.com/pytorch/pytorch/issues/142466. Remove the `weight.numel() != 0` check to align the behavior with CUDA for `ConvTranspose` when `out_channels=0`. After removing this check, the existing code is already able to give an empty output in such case. Test plan: ``` python -u test/nn/test_convolution.py -k test_ConvTranspose_output_channels_0_cpu_float32 python -u test/nn/test_convolution.py -k test_ConvTranspose_output_channels_0_cuda_float32 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/142859 Approved by: https://github.com/mingfeima, https://github.com/malfet	2025-01-22 17:52:53 +00:00
Tom Ritchford	eaef613688	Fix issue with test/nn/test_convolution:TestConvolutionNNDeviceTypeCUDA.test_conv_large_batch_1_cuda (#145067 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145067 Approved by: https://github.com/Skylion007, https://github.com/nWEIdia Co-authored-by: Wei Wang <143543872+nWEIdia@users.noreply.github.com>	2025-01-17 20:31:25 +00:00
Tom Ritchford	c947a7d38e	Fix unused Python variables in test/nn (#143396 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/143396 Approved by: https://github.com/mikaylagawarecki	2024-12-18 03:30:54 +00:00
Nikita Shulga	9c88b08ac9	[BE] Replace `skipIfMPS` with `expectedFailureMPS` (#139940 ) Functionally two decorators are very similar, but one should rely on expectedFailure as much as possible to get signal when something is fixed. - Move `product_version` variable from `test_mps` to common_utils, but call it `MACOS_VERSION` - Introduce `skipIfMPSOnMacOS13` to decorate the hard crashes that happens only on MacOS13 (which at this point will not get any fixes and will be deprecated soon) - Add `device_type='mps'` to all `skipIfMPS` per https://github.com/pytorch/pytorch/issues/140560 Pull Request resolved: https://github.com/pytorch/pytorch/pull/139940 Approved by: https://github.com/janeyx99, https://github.com/huydhn	2024-11-15 03:48:37 +00:00
Eddie Yan	846b4e614b	[TF32][cuDNN][Convolution] Add some missing TF32 decorators (#138768 ) Newer cuDNN versions seem to be able to dispatch to cuDNN kernels Pull Request resolved: https://github.com/pytorch/pytorch/pull/138768 Approved by: https://github.com/Skylion007	2024-10-25 19:03:42 +00:00
Siddharth Kotapati	e27c0048db	Enable additional tests for MPS CI runs (#134356 ) As part of the follow up for https://github.com/pytorch/pytorch/issues/133520, adapting existing unused tests for use in MPS CI runs. Focusing on nhwc & other memory formatting tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/134356 Approved by: https://github.com/malfet, https://github.com/eqy, https://github.com/huydhn	2024-10-04 21:52:38 +00:00
Mikayla Gawarecki	d9576c9440	Fix failures when default is flipped for weights_only (#127627 ) Tests on XLA shard not fixed yet but there is an issue here https://github.com/pytorch/xla/issues/7799 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127627 Approved by: https://github.com/albanD ghstack dependencies: #132349	2024-08-16 00:22:43 +00:00
Xuehai Pan	fbe6f42dcf	[BE][Easy][8/19] enforce style for empty lines in import segments in `test/[k-p]*/` (#129759 ) See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter. You can review these PRs via: ```bash git diff --ignore-all-space --ignore-blank-lines HEAD~1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129759 Approved by: https://github.com/justinchuby, https://github.com/ezyang	2024-07-31 02:09:20 +00:00
eellison	28f29e074b	Dont mutate tensor stride in place in cudnn conv (#126786 ) Fix for https://github.com/pytorch/pytorch/issues/126241. Within the cudnn convolution, we were in-place updating the strides of the tensor to disambiguate for size-1 dims and contiguous and channels last tensors. Instead of mutating the tensors stride, just use a temporary. Inside cudnn it is then copied: `d7ccb5b3c4/include/cudnn_frontend_Tensor.h (L201-L203)`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126786 Approved by: https://github.com/ezyang, https://github.com/shunting314, https://github.com/eqy	2024-05-22 01:53:44 +00:00
eqy	973d724e21	[CUDA] Fix 64-bit indexing in `vol2col` in conv3d (#124650 ) Similar to #118005, fixes sometimes silent IMAs that occur CC @atalman @malfet Pull Request resolved: https://github.com/pytorch/pytorch/pull/124650 Approved by: https://github.com/soulitzer	2024-04-25 23:21:43 +00:00

80 Commits