pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Scott Todd	ee89cc7a0a	[ROCm][Windows] Fix LoadHIP handling of environment variable paths on Windows. (#159080 ) See https://cmake.org/cmake/help/latest/command/file.html#path-conversion. Paths stored in environment variables may use `/` or `\` (e.g. on Windows), while cmake-style paths always use `/`. This fixes configure errors like:
```
CMake Error at D:/b/pytorch_main/build/CMakeFiles/CMakeScratch/TryCompile-srhq07/CMakeLists.txt:2 (set):
  Syntax error in cmake code at
    D:/b/pytorch_main/build/CMakeFiles/CMakeScratch/TryCompile-srhq07/CMakeLists.txt:2
  when parsing string
    D:\projects\TheRock\external-builds\pytorch\.venv\Lib\site-packages\_rocm_sdk_devel/cmake/;D:/b/pytorch_main/cmake/Modules
  Invalid character escape '\p'.

CMake Error at D:/projects/TheRock/external-builds/pytorch/.venv/Lib/site-packages/cmake/data/share/cmake-3.31/Modules/Internal/CheckSourceCompiles.cmake:108 (try_compile):
  Failed to configure test project build system.
```
(note the mixed usage of `\` and `/` in that string) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159080 Approved by: https://github.com/jeffdaily	2025-08-12 00:18:19 +00:00
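The failure in the commit above comes from a backslash path being spliced into CMake code, where `\p` is parsed as an escape sequence. As a rough illustration only (written in Python rather than CMake, and not the code from the PR), the sketch below shows the separator normalization that the linked CMake documentation describes for `file(TO_CMAKE_PATH ...)`: every `\` becomes `/`. The helper name `to_cmake_style` is hypothetical.

```python
def to_cmake_style(path: str) -> str:
    """Return `path` with every backslash replaced by a forward slash.

    Hypothetical helper for illustration; it mimics only the separator
    normalization, not everything CMake's path conversion does.
    """
    return path.replace("\\", "/")


# The mixed-separator directory from the error message in the commit above.
mixed = (
    r"D:\projects\TheRock\external-builds\pytorch\.venv\Lib"
    r"\site-packages\_rocm_sdk_devel/cmake/"
)
normalized = to_cmake_style(mixed)
print(normalized)
```

Once normalized, the string contains no backslashes, so there is nothing left for the CMake parser to misread as an escape.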
tvukovic-amd	a23f4471b9	[ROCm][Windows] Fix finding ROCm/HIP version (#156486 ) This commit fixes Windows build issue related to trying to use rocm-core (rocm-core doesn't exist on HIP SDK) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156486 Approved by: https://github.com/jeffdaily, https://github.com/stellaraccident	2025-07-16 15:31:43 +00:00
tvukovic-amd	b2d473c8f8	[ROCm][Windows] Fix rocsolver undefined symbol error (#156591 ) Fix undefined symbol error while using `rocsolver_ssyevd_strided_batched` call in `aten/src/ATen/native/cuda/linalg/BatchLinearAlgebraLib.cpp`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/156591 Approved by: https://github.com/jeffdaily	2025-06-24 03:28:45 +00:00
Jeff Daily	30d3cf62fb	support CUBLASLT_MATMUL_MATRIX_SCALE_OUTER_VEC_32F (#154680 ) Requires CUDA >= 12.9 and sm_90. hipBLASLt has a similar enum but is not available until ROCm 7.0. Support the new enum early using a cmake test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154680 Approved by: https://github.com/malfet, https://github.com/atalman	2025-06-18 18:39:01 +00:00
Stella Laurenzo	10cd1de518	[ROCm] Make optional features in LoadHIP better conditioned. (#155305 ) * The `rocm-core` CMake package only started appearing in ROCm 6.4, so rework the version probing to work if it is not present. Also collapses the unneeded operating system conditioning in favor of feature probing. * Make `hipsparselt` optional: it only started appearing in ROCm 6.4 and it is not in all recent distribution channels yet. Pull Request resolved: https://github.com/pytorch/pytorch/pull/155305 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-06-07 02:20:55 +00:00
Peter Y. Yeh	43390d8b13	ROCm Sparsity through HipSparseLT (#150578 ) TLDR:
- This pull request introduces support for hipSPARSELt in ROCm; the current use is semi-structured sparsity.
- Requires ROCm 6.4 and gfx942/gfx950.
- The average performance uplift (compared to the dense operation) is ~20% in ROCm 6.4, with further performance gains expected along the way.

### Dense vs. Sparse Performance Comparison

#### NT (Row-major)

Average Uplift: `1.20`

| M | N | K | hipsparselt-bench (us) | hipblaslt-bench get all (us) | Uplift |
|---|---|---|---|---|---|
| 14336 | 8 | 4096 | 20.05 | 25.3 | 1.26 |
| 4096 | 8 | 14336 | 21.07 | 25.28 | 1.20 |
| 3072 | 3072 | 10240 | 299.05 | 351.82 | 1.18 |
| 3072 | 1536 | 768 | 18.56 | 20.05 | 1.08 |
| 3072 | 17664 | 768 | 163.13 | 173.91 | 1.07 |
| 3072 | 196608 | 768 | 1717.30 | 1949.63 | 1.14 |
| 3072 | 24576 | 768 | 206.84 | 242.98 | 1.17 |
| 3072 | 6144 | 768 | 53.90 | 56.88 | 1.06 |
| 3072 | 98304 | 768 | 833.77 | 962.28 | 1.15 |
| 768 | 1536 | 768 | 8.53 | 19.65 | 2.30 |
| 768 | 17664 | 768 | 46.02 | 46.84 | 1.02 |
| 768 | 196608 | 768 | 463.15 | 540.46 | 1.17 |
| 768 | 24576 | 768 | 54.32 | 59.55 | 1.10 |
| 768 | 6144 | 768 | 19.47 | 20.15 | 1.03 |
| 768 | 98304 | 768 | 231.88 | 258.73 | 1.12 |

---

#### NN (Row-major)

Average Uplift: `1.13`

| M | N | K | hipsparselt-bench (us) | hipblaslt-bench get all (us) | Uplift |
|---|---|---|---|---|---|
| 768 | 1536 | 3072 | 27.50 | 28.78 | 1.05 |
| 768 | 17664 | 3072 | 125.06 | 158.94 | 1.27 |
| 768 | 196608 | 3072 | 1568.38 | 1767.12 | 1.13 |
| 768 | 24576 | 3072 | 171.05 | 203.49 | 1.19 |
| 768 | 6144 | 3072 | 58.72 | 60.39 | 1.03 |
| 768 | 98304 | 3072 | 787.15 | 887.60 | 1.13 |

------------------------- This pull request introduces support for hipSPARSELt in ROCm, alongside various updates and improvements to the codebase and test suite. The changes primarily involve adding configuration flags, updating conditional checks, and ensuring compatibility with hipSPARSELt. ### ROCm and hipSPARSELt Support: * [`BUILD.bazel`](diffhunk://#diff-7fc57714ef13c3325ce2a1130202edced92fcccc0c6db34a72f7b57f60d552a3R292): Added `@AT_HIPSPARSELT_ENABLED@` substitution to enable hipSPARSELt support. * [`aten/CMakeLists.txt`](diffhunk://#diff-0604597797bb21d7c39150f9429d6b2ace10b79ab308514ad03f76153ae8249bR104-R110): Introduced a conditional flag to enable hipSPARSELt support based on ROCm version. * [`aten/src/ATen/CMakeLists.txt`](diffhunk://#diff-ce80f3115ab2f6be5142f0678a1fc92c6b2d7727766ce44f48726c99e720f777R37): Added `AT_HIPSPARSELT_ENABLED` configuration. * [`aten/src/ATen/cuda/CUDAConfig.h.in`](diffhunk://#diff-8bb82da825ca87c28233abacffa1b0566c73a54990b7a77f3f5108d3718fea15R11): Defined `AT_HIPSPARSELT_ENABLED` macro. * `caffe2/CMakeLists.txt`, `cmake/Dependencies.cmake`, `cmake/public/LoadHIP.cmake`: Included hipSPARSELt in the ROCm dependencies. [[1]](diffhunk://#diff-c5ee05f1e918772792ff6f2a3f579fc2f182e57b1709fd786ef6dc711fd68b27R1380) [[2]](diffhunk://#diff-12e8125164bbfc7556b1781a8ed516e333cc0bf058acb7197f7415be44606c72L1084-R1084) [[3]](diffhunk://#diff-b98e27b9a5f196a6965a99ee5a7bb15b3fc633d6375b767635b1b04ccb2fd3d5R153) ### Codebase Updates: * [`aten/src/ATen/native/sparse/cuda/cuSPARSELtOps.cpp`](diffhunk://#diff-ae921dd1584ab98fdd9c25a3521047795de702223f5b65fdaa45a5bd92b4d1f3R1-R6): Added hipSPARSELt support checks and initialization functions. Updated various methods to conditionally handle hipSPARSELt. 
[[1]](diffhunk://#diff-ae921dd1584ab98fdd9c25a3521047795de702223f5b65fdaa45a5bd92b4d1f3R1-R6) [[2]](diffhunk://#diff-ae921dd1584ab98fdd9c25a3521047795de702223f5b65fdaa45a5bd92b4d1f3R22-R67) [[3]](diffhunk://#diff-ae921dd1584ab98fdd9c25a3521047795de702223f5b65fdaa45a5bd92b4d1f3R78-R85) [[4]](diffhunk://#diff-ae921dd1584ab98fdd9c25a3521047795de702223f5b65fdaa45a5bd92b4d1f3R97-R109) [[5]](diffhunk://#diff-ae921dd1584ab98fdd9c25a3521047795de702223f5b65fdaa45a5bd92b4d1f3R183-R188) [[6]](diffhunk://#diff-ae921dd1584ab98fdd9c25a3521047795de702223f5b65fdaa45a5bd92b4d1f3L134-R200) [[7]](diffhunk://#diff-ae921dd1584ab98fdd9c25a3521047795de702223f5b65fdaa45a5bd92b4d1f3R213-R222) [[8]](diffhunk://#diff-ae921dd1584ab98fdd9c25a3521047795de702223f5b65fdaa45a5bd92b4d1f3L217-R285) ### Test Suite Updates: * [`test/test_sparse_semi_structured.py`](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cR50-R65): Added checks for hipSPARSELt availability and updated test conditions to skip tests not supported on ROCm. 
[[1]](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cR50-R65) [[2]](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cR228) [[3]](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cR239) [[4]](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cR250) [[5]](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cR579) [[6]](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cR624) [[7]](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cR661) [[8]](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cR695) [[9]](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cR730) [[10]](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cR755) [[11]](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cR771) [[12]](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cR809) [[13]](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cR844) [[14]](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cL840-R854) [[15]](diffhunk://#diff-b7b57bc1e34145ef89c7929751d5d26aeecc8edfb37da9c60e9d3f0a1335133cR1005) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150578 Approved by: https://github.com/jeffdaily	2025-05-31 02:03:40 +00:00
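The Uplift column in the benchmark tables of the hipSPARSELt commit above (#150578) appears to be the dense hipblaslt-bench time divided by the sparse hipsparselt-bench time (for example, 28.78 / 27.50 ≈ 1.05). A minimal sketch, assuming that definition and an unweighted mean over the rows, recomputes the NN (Row-major) figures. The variable names are illustrative; the timings are copied from the NN table.

```python
from statistics import mean

# (M, N, K, hipsparselt-bench us, hipblaslt-bench us) rows of the NN table.
nn_rows = [
    (768, 1536, 3072, 27.50, 28.78),
    (768, 17664, 3072, 125.06, 158.94),
    (768, 196608, 3072, 1568.38, 1767.12),
    (768, 24576, 3072, 171.05, 203.49),
    (768, 6144, 3072, 58.72, 60.39),
    (768, 98304, 3072, 787.15, 887.60),
]

# Assumed definition: uplift = dense time / sparse time.
uplifts = [dense_us / sparse_us for (_m, _n, _k, sparse_us, dense_us) in nn_rows]

per_row = [round(u, 2) for u in uplifts]
average = round(mean(uplifts), 2)

print(per_row)
print(average)
```

Under these assumptions the per-row values and the average agree with the figures reported in the commit message for the NN case.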
Jithun Nair	f11d7a5978	[ROCm] Update spack includes (#152569 ) * Cleans up code in `caffe2/CMakeLists.txt` to remove individual ROCm library include paths and use `ROCM_INCLUDE_DIRS` CMake var instead * `ROCM_INCLUDE_DIRS` CMake var is set in `cmake/public/LoadHIP.cmake` by adding all the ROCm packages that PyTorch depends on * `rocm_version.h` is provided by the `rocm-core` package, so use the include directory for that component to be compliant with Spack * Move `find_package_and_print_version(hip REQUIRED CONFIG)` earlier so that `hip_version.h` can be located in the hip package include dir for Spack * `list(REMOVE_DUPLICATES ROCM_INCLUDE_DIRS)` to remove duplicate `/opt/rocm/include` entries in the non-Spack case * Remove user-provided env var `ROCM_INCLUDE_DIRS` since `ROCM_PATH` already exists as a user-provided env var, which should be sufficient to locate the include directories for ROCm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/152569 Approved by: https://github.com/renjithravindrankannath, https://github.com/jeffdaily Co-authored-by: Renjith Ravindran <Renjith.RavindranKannath@amd.com>	2025-05-09 21:36:38 +00:00
Faa Diallo	423e4a4568	[ROCm] cmake 4 workaround for hiprtc (#150324 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150324 Approved by: https://github.com/jeffdaily, https://github.com/atalman, https://github.com/malfet	2025-03-31 21:55:53 +00:00
Nichols A. Romero	c0af782f30	[ROCm] Change LoadHIP to use find_file for rocm_version.h (#149983 ) Fixes #149805 Pull Request resolved: https://github.com/pytorch/pytorch/pull/149983 Approved by: https://github.com/jeffdaily	2025-03-26 21:26:41 +00:00
Michal Gallus	b706044cca	[ROCm][Windows] Enable hipblaslt for Windows (#148563 ) This PR adds hipblaslt library as one of the Windows' dependencies. `rocBLAS` is added too, since certain symbols aren't detected with `hipblas` alone on Windows. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148563 Approved by: https://github.com/jeffdaily	2025-03-10 21:07:16 +00:00
Peter Yeh	81dccd706b	[ROCm] OCP FP8 Support for new GPUs (#146632 ) TLDR: Follow up/ Build on top of https://github.com/pytorch/pytorch/pull/144476. add OCP FP8 support for gfx950 refer to https://github.com/pytorch/ao/pull/1677 This pull request includes several changes to improve compatibility and support for new GPU architectures and data types, particularly for ROCm. The key updates involve adding support for new ROCm versions and GPU architectures, updating data type handling, and removing outdated checks. ### Improvements to GPU Architecture and ROCm Version Support: * [`aten/src/ATen/Context.cpp`](diffhunk://#diff-33de472d304acbe57d693c8567370c638068bedc1aa0ce8e9dc115dad05a7810L323-R326): Added support for new GPU architectures `gfx1200`, `gfx1201`, and `gfx950` based on ROCm version checks. * [`aten/src/ATen/native/cuda/Blas.cpp`](diffhunk://#diff-e8a569efee1e650172f120a0fdcda024fe3e4703a4ee3336425c8f685af6b3abL196-R199): Updated architecture support in multiple functions to include `gfx1200`, `gfx1201`, and `gfx950` based on ROCm version checks. [[1]](diffhunk://#diff-e8a569efee1e650172f120a0fdcda024fe3e4703a4ee3336425c8f685af6b3abL196-R199) [[2]](diffhunk://#diff-e8a569efee1e650172f120a0fdcda024fe3e4703a4ee3336425c8f685af6b3abL865-R876) ### Updates to Data Type Handling: * [`aten/src/ATen/cuda/CUDADataType.h`](diffhunk://#diff-9188bb13b1a49f459141f5f9b875593d1c5ce2beb5ad711fdbaf5bc7089ec015L81-L98): Enhanced data type conversion to include new float8 types for both CUDA and ROCm environments. * [`aten/src/ATen/cuda/tunable/GemmHipblaslt.h`](diffhunk://#diff-bfa1a3b5d4bef1892bf50338775f3b0fd8cd31fc1868148f3968b98aefb68e3fL29-R80): Updated `HipDataTypeFor` template to handle new float8 types and added hard-coded enum values for ROCm versions prior to 6.3. 
### Removal of Outdated Checks: * [`cmake/public/LoadHIP.cmake`](diffhunk://#diff-b98e27b9a5f196a6965a99ee5a7bb15b3fc633d6375b767635b1b04ccb2fd3d5L169-L197): Removed the check for `HIP_NEW_TYPE_ENUMS` as it is no longer necessary with the updated ROCm versions. [[1]](diffhunk://#diff-b98e27b9a5f196a6965a99ee5a7bb15b3fc633d6375b767635b1b04ccb2fd3d5L169-L197) [[2]](diffhunk://#diff-b98e27b9a5f196a6965a99ee5a7bb15b3fc633d6375b767635b1b04ccb2fd3d5L211-R182) These changes ensure better compatibility and performance on newer hardware and software environments, particularly for users leveraging ROCm and CUDA for deep learning and scientific computing tasks. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146632 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-02-24 22:47:52 +00:00
PyTorch MergeBot	3e2d9d079e	Revert "[ROCm] OCP FP8 Support for new GPUs (#146632 )" This reverts commit `f95ab46797`. Reverted https://github.com/pytorch/pytorch/pull/146632 on behalf of https://github.com/jeanschmidt due to Breaking internal builds, I'll find someone to help merge this PR back to main ([comment](https://github.com/pytorch/pytorch/pull/146632#issuecomment-2676823614))	2025-02-23 12:04:50 +00:00
Peter Yeh	f95ab46797	[ROCm] OCP FP8 Support for new GPUs (#146632 ) TLDR: Follow up/ Build on top of https://github.com/pytorch/pytorch/pull/144476. add OCP FP8 support for gfx950 refer to https://github.com/pytorch/ao/pull/1677 This pull request includes several changes to improve compatibility and support for new GPU architectures and data types, particularly for ROCm. The key updates involve adding support for new ROCm versions and GPU architectures, updating data type handling, and removing outdated checks. ### Improvements to GPU Architecture and ROCm Version Support: * [`aten/src/ATen/Context.cpp`](diffhunk://#diff-33de472d304acbe57d693c8567370c638068bedc1aa0ce8e9dc115dad05a7810L323-R326): Added support for new GPU architectures `gfx1200`, `gfx1201`, and `gfx950` based on ROCm version checks. * [`aten/src/ATen/native/cuda/Blas.cpp`](diffhunk://#diff-e8a569efee1e650172f120a0fdcda024fe3e4703a4ee3336425c8f685af6b3abL196-R199): Updated architecture support in multiple functions to include `gfx1200`, `gfx1201`, and `gfx950` based on ROCm version checks. [[1]](diffhunk://#diff-e8a569efee1e650172f120a0fdcda024fe3e4703a4ee3336425c8f685af6b3abL196-R199) [[2]](diffhunk://#diff-e8a569efee1e650172f120a0fdcda024fe3e4703a4ee3336425c8f685af6b3abL865-R876) ### Updates to Data Type Handling: * [`aten/src/ATen/cuda/CUDADataType.h`](diffhunk://#diff-9188bb13b1a49f459141f5f9b875593d1c5ce2beb5ad711fdbaf5bc7089ec015L81-L98): Enhanced data type conversion to include new float8 types for both CUDA and ROCm environments. * [`aten/src/ATen/cuda/tunable/GemmHipblaslt.h`](diffhunk://#diff-bfa1a3b5d4bef1892bf50338775f3b0fd8cd31fc1868148f3968b98aefb68e3fL29-R80): Updated `HipDataTypeFor` template to handle new float8 types and added hard-coded enum values for ROCm versions prior to 6.3. 
### Removal of Outdated Checks: * [`cmake/public/LoadHIP.cmake`](diffhunk://#diff-b98e27b9a5f196a6965a99ee5a7bb15b3fc633d6375b767635b1b04ccb2fd3d5L169-L197): Removed the check for `HIP_NEW_TYPE_ENUMS` as it is no longer necessary with the updated ROCm versions. [[1]](diffhunk://#diff-b98e27b9a5f196a6965a99ee5a7bb15b3fc633d6375b767635b1b04ccb2fd3d5L169-L197) [[2]](diffhunk://#diff-b98e27b9a5f196a6965a99ee5a7bb15b3fc633d6375b767635b1b04ccb2fd3d5L211-R182) These changes ensure better compatibility and performance on newer hardware and software environments, particularly for users leveraging ROCm and CUDA for deep learning and scientific computing tasks. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146632 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-02-21 23:44:08 +00:00
Michal Gallus	3f5ed05688	[Windows][ROCm] Fix c10 hip tests (#146599 ) - Solves a problem related to .hip source files being ignored by the build system when HIP language is not enabled in CMake. - Also ensures that the test executables link to an appropriate CRT Runtime Library and hence have access to all the necessary symbols. Previously, there were many problems related to linkage errors. - Moves part of Linux-related hipBLASLt changes in `LoadHIP.cmake` under the UNIX conditional branch, as these aren't supported on Windows yet. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146599 Approved by: https://github.com/jeffdaily	2025-02-06 23:41:25 +00:00
Jeff Daily	6ac0616504	[ROCm] hipblaslt rowwise f8 gemm (#144432 ) hipblaslt added rowwise f8 gemm support. Integrate with scaled_mm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144432 Approved by: https://github.com/drisspg	2025-01-15 18:23:44 +00:00
Michal Gallus	4cbb3b4bd2	[ROCm] Enable finding HIP and ROCm libraries on Windows (#137279 ) This PR introduces support for finding HIP-SDK Libraries on Windows. Since reading the code changes using the diff view is a bit cumbersome due to the introduced `if` branch, let me explain what was changed: - The Linux-specific steps to find HIP packages have been dragged into an `if(UNIX)` block - Windows steps follow in the `else()` clause The separation was needed because of several factors: - HIP SDK for Windows typically names its components using `hip` in their names (for example: `hip_version.h` instead of `rocm_version.h`, `HIP_VERSION_DEV_MAJOR` instead of `ROCM_VERSION_DEV_MAJOR`, etc.), - The libraries included in HIP SDK are only a subset of what is available in Linux ROCm (missing hsa-rt, rccl, roctx) - MIOpen isn't a part of HIP SDK, but can be built separately and as of now requires an additional path to be defined using an env var. - Windows can only find hip package in version greater than 1.0 and its libraries if the lowercase `find_package(hip ...)` is invoked first. This is because the lowercase `hip` name will cause the mechanism to find hip's packages using [config mode](https://cmake.org/cmake/help/latest/command/find_package.html#search-modes) which is the only one supported on Windows, assuming we also want to [include its libraries](https://rocm.docs.amd.com/en/latest/conceptual/cmake-packages.html#consuming-the-hip-api-in-c-code). The upper-case module-mode-searched `find_package(HIP)` is used later for inclusion of macros such as `hip_add_library` and related macros. Pull Request resolved: https://github.com/pytorch/pytorch/pull/137279 Approved by: https://github.com/jeffdaily	2024-12-03 03:26:01 +00:00
Nichols A. Romero	bd63ec4f45	[ROCm] LoadHIP CMake cleanup (#137112 ) Should help mitigate issues reported here: https://github.com/pytorch/pytorch/issues/128313 While working on https://github.com/pytorch/pytorch/pull/136700, we realized that some of the ROCm CMake can be streamlined. This PR does not fix any bugs or provide any new functionality. Strictly clean-up. The remaining `${ROCM_ROCTX_LIB}` will be removed when we transition to the rocprofiler-sdk (to be done in a separate PR). Pull Request resolved: https://github.com/pytorch/pytorch/pull/137112 Approved by: https://github.com/jithunnair-amd, https://github.com/jeffdaily	2024-10-13 00:06:41 +00:00
Nichols A. Romero	b9c9f7f0fa	Document ROCm environment variables and improve CMake messaging to user (#137308 ) Fixes #115725. Note that the GitHub issue title is misleading. Read the comments to understand what the problem is really about. The PR improves the documentation and CMake's behavior for ROCm builds. - Documentation: Two environment variables for ROCm builds are now documented: `ROCM_PATH` and `PYTORCH_ROCM_ARCH`. - CMake: Improved diagnostic messaging and error handling with respect to `ROCM_PATH` Pull Request resolved: https://github.com/pytorch/pytorch/pull/137308 Approved by: https://github.com/pruthvistony, https://github.com/jithunnair-amd, https://github.com/jeffdaily	2024-10-10 03:08:08 +00:00
Nichols A. Romero	e8f1dd6ba0	Fix hardcoded ROCm paths in `Caffe2Targets.cmake` (#136283 ) Fixes #131701 Use CMake imported targets more consistently to eliminate hardcoded paths. Here are the relevant new sections of Caffe2Targets.cmake:
```
set_target_properties(c10_hip PROPERTIES
  INTERFACE_INCLUDE_DIRECTORIES "${_IMPORT_PREFIX}/include"
  INTERFACE_LINK_LIBRARIES "c10;hip::amdhip64"
)
```
```
set_target_properties(torch_hip PROPERTIES
  INTERFACE_COMPILE_DEFINITIONS "USE_C10D_NCCL"
  INTERFACE_COMPILE_OPTIONS "-fPIC;-D__HIP_PLATFORM_AMD__=1;-DCUDA_HAS_FP16=1;-DUSE_ROCM;-D__HIP_NO_HALF_OPERATORS__=1;-D__HIP_NO_HALF_CONVERSIONS__=1;-DTORCH_HIP_VERSION=602;-Wno-shift-count-negative;-Wno-shift-count-overflow;-Wno-duplicate-decl-specifier;-DCAFFE2_USE_MIOPEN;-DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_HIP;-std=c++17;-DHIPBLAS_V2;-DHIP_NEW_TYPE_ENUMS"
  INTERFACE_INCLUDE_DIRECTORIES "${_IMPORT_PREFIX}/include"
  INTERFACE_LINK_LIBRARIES "c10_hip;torch_cpu_library;hip::amdhip64;MIOpen;hiprtc::hiprtc;roc::hipblaslt;roc::hipblas;hip::hipfft;hip::hiprand;roc::hipsparse;roc::hipsolver"
)
```
The HIPCUB dependency was not actually used, which is why it is removed here, as the imported target had undesirable side effects. Pull Request resolved: https://github.com/pytorch/pytorch/pull/136283 Approved by: https://github.com/jeffdaily, https://github.com/Skylion007, https://github.com/jithunnair-amd, https://github.com/atalman	2024-09-26 00:34:43 +00:00
Jeff Daily	ae9a4fa63c	[ROCm] enforce ROCM_VERSION >= 6.0 (#125646 ) Remove any code relying on ROCM_VERSION < 6.0. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125646 Approved by: https://github.com/albanD, https://github.com/eqy	2024-05-12 18:01:28 +00:00
Jeff Daily	0e6eee3c89	[ROCm] TunableOp (#114894 ) Some operations, such as GEMMs, could be implemented using more than one library or more than one technique. For example, a GEMM could be implemented for CUDA or ROCm using either the blas or blasLt libraries. Further, ROCm's rocblas and hipblaslt libraries allow the user to query for all possible algorithms and then choose one. How does one know which implementation is the fastest and should be chosen? That's what TunableOp provides. See the README.md for additional details. TunableOp was ported from onnxruntime starting from commit `08dce54266`. The content was significantly modified and reorganized for use within PyTorch. The files copied and their approximate new names or source content location within aten/src/ATen/cuda/tunable include the following: - onnxruntime/core/framework/tunable.h -> Tunable.h - onnxruntime/core/framework/tuning_context.h -> Tunable.h - onnxruntime/core/framework/tuning_context_impl.h -> Tunable.cpp - onnxruntime/core/providers/rocm/tunable/gemm_common.h -> GemmCommon.h - onnxruntime/core/providers/rocm/tunable/gemm_hipblaslt.h -> GemmHipblaslt.h - onnxruntime/core/providers/rocm/tunable/gemm_rocblas.h -> GemmRocblas.h - onnxruntime/core/providers/rocm/tunable/gemm_tunable.cuh -> TunableGemm.h - onnxruntime/core/providers/rocm/tunable/rocm_tuning_context.cc -> Tunable.cpp - onnxruntime/core/providers/rocm/tunable/util.h -> StreamTimer.h - onnxruntime/core/providers/rocm/tunable/util.cc -> StreamTimer.cpp Pull Request resolved: https://github.com/pytorch/pytorch/pull/114894 Approved by: https://github.com/xw285cornell, https://github.com/jianyuh	2024-02-14 19:03:49 +00:00
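The selection idea described in the TunableOp commit above (several interchangeable implementations of one operation, with the fastest one chosen by measurement) can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the TunableOp code: `pick_fastest`, `FakeClock`, and the candidate names and costs are all hypothetical, and a simulated clock replaces real GPU timing so the result is deterministic.

```python
import time
from typing import Callable, Dict, Optional


def pick_fastest(candidates: Dict[str, Callable[[], None]],
                 repeats: int = 3,
                 clock: Callable[[], float] = time.perf_counter) -> str:
    """Time every candidate and return the name of the fastest one."""
    best_name: Optional[str] = None
    best_time = float("inf")
    for name, run in candidates.items():
        run()  # warm-up call, excluded from the measurement
        start = clock()
        for _ in range(repeats):
            run()
        elapsed = (clock() - start) / repeats
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    if best_name is None:
        raise ValueError("no candidates to tune")
    return best_name


class FakeClock:
    """Simulated clock so the example does not depend on real timings."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


clock = FakeClock()
# Hypothetical implementations of the same GEMM with made-up costs.
candidates = {
    "blas": lambda: clock.advance(2.0),
    "blaslt": lambda: clock.advance(1.5),
}
winner = pick_fastest(candidates, clock=clock)
print(winner)
```

Passing the clock in as a parameter keeps the selection logic testable; with real kernels one would rely on the default `time.perf_counter` or a device-side timer instead.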
Jeff Daily	2c9a90cde6	[ROCm] backward compatible type enums (#118137 ) Fixes builds of pytorch using unreleased ROCm packages that are missing type enums introduced in ROCm 6.0 release. Pull Request resolved: https://github.com/pytorch/pytorch/pull/118137 Approved by: https://github.com/xw285cornell, https://github.com/anupambhatnagar	2024-01-26 08:40:13 +00:00
Jeff Daily	602abf6b55	[ROCm] more 6.0 changes (#115946 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115946 Approved by: https://github.com/pruthvistony, https://github.com/huydhn, https://github.com/malfet	2023-12-20 20:19:29 +00:00
Jeff Daily	8bff59e41d	[ROCm] add hipblaslt support (#114329 ) Disabled by default. Enable with env var DISABLE_ADDMM_HIP_LT=0. Tested on both ROCm 5.7 and 6.0. Pull Request resolved: https://github.com/pytorch/pytorch/pull/114329 Approved by: https://github.com/malfet	2023-12-20 19:09:25 +00:00
PyTorch MergeBot	47908a608f	Revert "[ROCm] add hipblaslt support (#114329 )" This reverts commit `b062ea3803`. Reverted https://github.com/pytorch/pytorch/pull/114329 on behalf of https://github.com/jeanschmidt due to Reverting due to inconsistencies on internal diff ([comment](https://github.com/pytorch/pytorch/pull/114329#issuecomment-1861933267))	2023-12-19 01:04:58 +00:00
Jeff Daily	b062ea3803	[ROCm] add hipblaslt support (#114329 ) Disabled by default. Enable with env var DISABLE_ADDMM_HIP_LT=0. Tested on both ROCm 5.7 and 6.0. Pull Request resolved: https://github.com/pytorch/pytorch/pull/114329 Approved by: https://github.com/malfet	2023-12-15 15:36:46 +00:00
PyTorch MergeBot	59f7355f86	Revert "[ROCm] add hipblaslt support (#114329 )" This reverts commit `bb2bb8cca1`. Reverted https://github.com/pytorch/pytorch/pull/114329 on behalf of https://github.com/atalman due to OSSCI oncall, trunk tests are failing ([comment](https://github.com/pytorch/pytorch/pull/114329#issuecomment-1857003155))	2023-12-14 23:53:30 +00:00
Jeff Daily	bb2bb8cca1	[ROCm] add hipblaslt support (#114329 ) Disabled by default. Enable with env var DISABLE_ADDMM_HIP_LT=0. Tested on both ROCm 5.7 and 6.0. Pull Request resolved: https://github.com/pytorch/pytorch/pull/114329 Approved by: https://github.com/malfet	2023-12-14 21:41:22 +00:00
blorange-amd	6cdb6234d6	[ROCm] Supports ROCm6.0 reorganization and cleanup (#111486 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/111486 Approved by: https://github.com/jithunnair-amd, https://github.com/pruthvistony, https://github.com/malfet	2023-11-16 18:37:12 +00:00
Jeff Daily	28c0b07d19	[ROCm] remove HCC references (#111975 ) - rename `__HIP_PLATFORM_HCC__` to `__HIP_PLATFORM_AMD__` - rename `HIP_HCC_FLAGS` to `HIP_CLANG_FLAGS` - rename `PYTORCH_HIP_HCC_LIBRARIES` to `PYTORCH_HIP_LIBRARIES` - workaround in tools/amd_build/build_amd.py until submodules are updated These symbols have had a long deprecation cycle and will finally be removed in ROCm 6.0. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111975 Approved by: https://github.com/ezyang, https://github.com/hongxiayang	2023-10-26 02:39:10 +00:00
Jeff Daily	5379b5f927	[ROCm] use hipblas instead of rocblas (#105881 ) - BatchLinearAlgebraLib.cpp is now split into one additional file - BatchLinearAlgebraLib.cpp uses only cusolver APIs - BatchLinearAlgebraLibBlas.cpp uses only cublas APIs - hipify operates at the file level and cannot mix cusolver and cublas APIs within the same file - cmake changes to link against hipblas instead of rocblas - hipify mappings changes to map cublas -> hipblas instead of rocblas Pull Request resolved: https://github.com/pytorch/pytorch/pull/105881 Approved by: https://github.com/albanD	2023-07-31 20:42:55 +00:00
Andres Lugo-Reyes	eaffd98880	Enable hipSOLVER in ROCm builds (#97370 ) Enables the hipSOLVER backend for ROCm builds -------------------------------------------------------------------------- - Minimum ROCm version requirement - 5.3 - Introduces new macro USE_LINALG_SOLVER that controls enablement of both cuSOLVER and hipSOLVER - Adds hipSOLVER API to hipification process - combines hipSOLVER and hipSPARSE mappings into single SPECIAL map that takes priority among normal mappings - Torch APIs to be moved to the hipSOLVER backend (as opposed to magma) include: torch.svd(), torch.geqrf(), torch.orgqr(), torch.ormqr() - Will enable 100+ linalg unit tests for ROCm Pull Request resolved: https://github.com/pytorch/pytorch/pull/97370 Approved by: https://github.com/malfet	2023-05-31 16:53:23 +00:00
Pruthvi Madugundu	08f125bcac	[ROCm] Remove usage of deprecated ROCm component header includes (#97620 ) - clang parameter 'amdgpu-target' changed to 'offload-arch' - HIP and MIOpen includes path updated for extensions Pull Request resolved: https://github.com/pytorch/pytorch/pull/97620 Approved by: https://github.com/ezyang, https://github.com/jithunnair-amd	2023-03-28 19:28:38 +00:00
Pruthvi Madugundu	fbd08fb358	Introduce TORCH_DISABLE_GPU_ASSERTS (#84190 ) - Asserts for CUDA are enabled by default - Disabled for ROCm by default by setting `TORCH_DISABLE_GPU_ASSERTS` to `ON` - Can be enabled for ROCm by setting the above variable to `OFF` during build, or can be forcefully enabled by setting `ROCM_FORCE_ENABLE_GPU_ASSERTS:BOOL=ON` These are follow-up changes as per the comment in PR #81790, comment [link](https://github.com/pytorch/pytorch/pull/81790#issuecomment-1215929021) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84190 Approved by: https://github.com/jeffdaily, https://github.com/malfet	2022-11-04 04:43:05 +00:00
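The defaults listed in the TORCH_DISABLE_GPU_ASSERTS commit above can be read as a small decision table. The sketch below is one interpretation in Python, not the CMake logic from the PR: the function name and parameters are hypothetical, and it assumes that `ROCM_FORCE_ENABLE_GPU_ASSERTS` takes precedence over `TORCH_DISABLE_GPU_ASSERTS` on ROCm and that an explicit `TORCH_DISABLE_GPU_ASSERTS` value is honored on either platform.

```python
from typing import Optional


def gpu_asserts_enabled(is_rocm: bool,
                        torch_disable_gpu_asserts: Optional[bool] = None,
                        rocm_force_enable: bool = False) -> bool:
    """Return whether GPU asserts end up enabled for a build.

    Hypothetical model of the commit message:
    - CUDA: asserts are enabled by default.
    - ROCm: asserts are disabled by default (TORCH_DISABLE_GPU_ASSERTS=ON).
    - ROCm: ROCM_FORCE_ENABLE_GPU_ASSERTS=ON enables them (assumed to win).
    """
    if is_rocm and rocm_force_enable:
        return True
    if torch_disable_gpu_asserts is None:
        # Default for TORCH_DISABLE_GPU_ASSERTS: ON for ROCm, OFF for CUDA.
        torch_disable_gpu_asserts = is_rocm
    return not torch_disable_gpu_asserts


print(gpu_asserts_enabled(is_rocm=False))
print(gpu_asserts_enabled(is_rocm=True))
print(gpu_asserts_enabled(is_rocm=True, torch_disable_gpu_asserts=False))
print(gpu_asserts_enabled(is_rocm=True, rocm_force_enable=True))
```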
PyTorch MergeBot	0fa23663cc	Revert "Introduce TORCH_DISABLE_GPU_ASSERTS (#84190 )" This reverts commit `1e2c4a6e0e`. Reverted https://github.com/pytorch/pytorch/pull/84190 on behalf of https://github.com/malfet due to Needs internal changes, has to be landed via co-dev	2022-11-02 18:13:37 +00:00
Pruthvi Madugundu	1e2c4a6e0e	Introduce TORCH_DISABLE_GPU_ASSERTS (#84190 ) - Asserts for CUDA are enabled by default - Disabled for ROCm by default by setting `TORCH_DISABLE_GPU_ASSERTS` to `ON` - Can be enabled for ROCm by setting the above variable to `OFF` during build, or can be forcefully enabled by setting `ROCM_FORCE_ENABLE_GPU_ASSERTS:BOOL=ON` These are follow-up changes as per the comment in PR #81790, comment [link](https://github.com/pytorch/pytorch/pull/81790#issuecomment-1215929021) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84190 Approved by: https://github.com/jeffdaily, https://github.com/malfet	2022-11-02 17:41:57 +00:00
Pruthvi Madugundu	8473e69684	[ROCm] Fixes the kernel asserts API declaration mismatch error (#81790 ) This PR updates PR [#73040](https://github.com/pytorch/pytorch/pull/73040). Compilation of PyTorch with ROCm succeeds with these changes when `NDEBUG` is enabled. Solution: For HIP we keep `__device__ __assert_fail()` and for host side compilation we want to use the `__assert_fail()` from the glibc library. Tested the code by compiling with the steps below:
```
python3 tools/amd_build/build_amd.py
python3 setup.py develop --cmake-only
cmake -DHIP_HIPCC_FLAGS_RELEASE="-DNDEBUG" build
cmake --build build
```
The UT test_fixed_cuda_assert_async is still skipped due to performance overhead. cc @jithunnair-amd Pull Request resolved: https://github.com/pytorch/pytorch/pull/81790 Approved by: https://github.com/shintaro-iwasaki, https://github.com/jeffdaily, https://github.com/malfet	2022-08-16 19:22:31 +00:00
Dmitry Mikushin	e08026d4d4	Use miopen_LIBRARIES and rccl_LIBRARIES directly, when they are valid target (#80446 ) As of [this RCCL PR](https://github.com/ROCmSoftwarePlatform/rccl/pull/570), `${rccl_LIBRARIES}` refers to the actual RCCL library target, not just a symbolic "rccl" string. So starting from the next release, no special treatment of it will be required in PyTorch anymore. This patch checks whether `${RCCL_LIBRARIES}` and `${MIOpen_LIBRARIES}` are already valid and, if they are, does not try to find them manually. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80446 Approved by: https://github.com/pruthvistony, https://github.com/malfet	2022-07-06 23:39:59 +00:00
Jeff Daily	64b543434d	[ROCm] update cmake package DIR paths (#77087 ) Fixes nightly libtorch builds. As of ROCm 5.1.x, all *.cmake files are under /opt/rocm/lib/cmake/package instead of /opt/rocm/package/lib/cmake. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77087 Approved by: https://github.com/seemethere	2022-05-10 19:06:51 +00:00
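The directory change described in this commit can be pictured as two candidate locations per package. The following is a minimal sketch that only mirrors the two path patterns quoted in the message; the helper name and the choice to probe the newer layout first are assumptions, not what the CMake files do.

```python
import posixpath


def cmake_package_dirs(rocm_root, package):
    """List candidate directories for a package's *.cmake files (hypothetical helper)."""
    # ROCm 5.1.x and later, per the commit message: <root>/lib/cmake/<package>
    new_layout = posixpath.join(rocm_root, "lib", "cmake", package)
    # Earlier layout, per the commit message: <root>/<package>/lib/cmake
    old_layout = posixpath.join(rocm_root, package, "lib", "cmake")
    # Assumption: probe the newer layout first, then fall back to the older one.
    return [new_layout, old_layout]
```

For example, `cmake_package_dirs("/opt/rocm", "hip")` lists `/opt/rocm/lib/cmake/hip` ahead of `/opt/rocm/hip/lib/cmake`.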
Shintaro Iwasaki	7dc2cfa249	[c10][rocm] fix __assert_fail() declaration mismatch error (#73040 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73040 This patch fixes a compilation error in PyTorch with ROCm when `NDEBUG` is passed. ## Problem Forward declaration of `__host__ __device__ __assert_fail()` is used in `c10/macros/Macros.h` for HIP compilation when `NDEBUG` is set. However, HIP has `__device__ __assert_fail()` in `hip/amd_detail/amd_device_functions.h`, causing a function type error. This issue does not appear in ROCm CI tests since it happens only when `NDEBUG` is passed. ## Solution [EDIT] After the discussion on GitHub, we chose to entirely disable `CUDA_KERNEL_ASSERT()` for ROCm. --- To solve this compilation error, this patch disables `CUDA_KERNEL_ASSERT()`, which uses `__assert_fail()` when 1. `c10/macros/Macros.h` is included for `.hip` (precisely speaking, `__HIP__` or `__HIP_ARCH__` is defined), and 2. `NDEBUG` is passed. Note that there's no impact on default compilation because, without a special compilation flag, those HIP files are compiled without `-DNDEBUG`. And that's why this issue has not been found. ### Justification [1] We cannot declare one host-and-device function for two separate host and device functions. ``` __device__ int func() { return 0; } __host__ int func() { return 0; } // Compile error (hipcc) // __device__ __host__ int func(); ``` [2] Forward declaration of a correct `__device__` only `__assert_fail()` for `__HIP__` causes the following error: ``` pytorch/c10/util/TypeCast.h:135:7: error: reference to __device__ function '__assert_fail' in __host__ __device__ function ERROR_UNSUPPORTED_CAST ^ pytorch/c10/util/TypeCast.h:118:32: note: expanded from macro 'ERROR_UNSUPPORTED_CAST' #define ERROR_UNSUPPORTED_CAST CUDA_KERNEL_ASSERT(false); ^ pytorch/c10/macros/Macros.h:392:5: note: expanded from macro 'CUDA_KERNEL_ASSERT' __assert_fail( ``` [3] Maybe there's a way to properly define `__assert_fail()` for HIP + NDEBUG, but this might be too much. Please let me just disable it. ### Technical details Error ``` pytorch/c10/macros/Macros.h:368:5: error: __host__ __device__ function '__assert_fail' cannot overload __device__ function '__assert_fail' __assert_fail( ^ /opt/rocm/hip/include/hip/amd_detail/amd_device_functions.h:1173:6: note: previous declaration is here void __assert_fail(const char *assertion, ``` CUDA definition (9.x) of `__assert_fail()` ``` #elif defined(__GNUC__) extern __host__ __device__ __cudart_builtin__ void __assert_fail( const char *, const char *, unsigned int, const char *) __THROW; ``` ROCm definition (the latest version) ``` // `2b59661f3e/include/hip/amd_detail/amd_device_functions.h (L1172-L1177)` extern "C" __device__ __attribute__((noinline)) __attribute__((weak)) void __assert_fail(const char *assertion, const char *file, unsigned int line, const char *function); ``` Test Plan: CI + reproducer ``` python3 tools/amd_build/build_amd.py python3 setup.py develop --cmake-only cmake -DHIP_HIPCC_FLAGS_RELEASE="-DNDEBUG" build cmake --build build ``` Reviewed By: xw285cornell Differential Revision: D34310555 fbshipit-source-id: 7542288912590533ced3f20afd2e704b6551991b (cherry picked from commit 9e52196e36820abe36bf6427cabc7389d3ea6cb5)	2022-03-01 04:35:30 +00:00
Douglas Lehr	eb4e6ca30c	[ROCM] Add ROCM version api within cmake (#69481 ) Summary: In ROCm 5.0 and later, the version of the ROCm platform can be obtained via an API call instead of being read from a flat file. If the header file /opt/rocm/include/rocm_version.h exists, LoadHIP.cmake compiles source referencing the API and prints the ROCm version. If the file does not exist, LoadHIP.cmake falls back to the previous approach of looking for the version-dev file. cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH Pull Request resolved: https://github.com/pytorch/pytorch/pull/69481 Reviewed By: seemethere, janeyx99 Differential Revision: D34153435 Pulled By: malfet fbshipit-source-id: f8c0650d27666d2a3cf47d812807798c47210b37 (cherry picked from commit `6cbb4f7a0c`)	2022-02-11 00:15:10 +00:00
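The header-first, file-fallback order described in this commit can be sketched as follows. This is an illustration only: the commit compiles a small program against the header, whereas this sketch swaps in a plain text scan of the `ROCM_VERSION_MAJOR`/`MINOR`/`PATCH` macros. The function name, the `.info/version-dev` location, and the `major.minor.patch-build` file format are assumptions.

```python
import os
import re


def detect_rocm_version(rocm_root):
    """Return (major, minor, patch) or None (hypothetical helper)."""
    header = os.path.join(rocm_root, "include", "rocm_version.h")
    if os.path.exists(header):
        # Newer installs: read the version macros from the header.
        with open(header) as f:
            text = f.read()
        parts = []
        for name in ("MAJOR", "MINOR", "PATCH"):
            m = re.search(r"#define\s+ROCM_VERSION_%s\s+(\d+)" % name, text)
            if m is None:
                return None
            parts.append(int(m.group(1)))
        return tuple(parts)
    # Older installs: fall back to the flat version-dev file (assumed location).
    legacy = os.path.join(rocm_root, ".info", "version-dev")
    if os.path.exists(legacy):
        with open(legacy) as f:
            m = re.match(r"(\d+)\.(\d+)\.(\d+)", f.read().strip())
        if m:
            return tuple(int(g) for g in m.groups())
    return None
```

Calling `detect_rocm_version("/opt/rocm")` on a machine without ROCm simply returns `None`.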
Jithun Nair	8dfdc3df82	[ROCm] Refactor how to specify AMD gpu targets using PYTORCH_ROCM_ARCH (#61706 ) Summary: Remove all hardcoded AMD gfx targets. PyTorch build and Magma build will use rocm_agent_enumerator as backup if PYTORCH_ROCM_ARCH env var is not defined. PyTorch extensions will use same gfx targets as the PyTorch build, unless PYTORCH_ROCM_ARCH env var is defined. torch.cuda.get_arch_list() now works for ROCm builds. PyTorch CI dockers will continue to be built for gfx900 and gfx906 for now. PYTORCH_ROCM_ARCH env var can be a space- or semicolon-separated list of gfx archs, e.g. "gfx900 gfx906" or "gfx900;gfx906". cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH Pull Request resolved: https://github.com/pytorch/pytorch/pull/61706 Reviewed By: seemethere Differential Revision: D32735862 Pulled By: malfet fbshipit-source-id: 3170e445e738e3ce373203e1e4ae99c84e645d7d	2021-12-13 15:41:40 -08:00
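The two accepted separators for `PYTORCH_ROCM_ARCH` can be handled with a single split. The sketch below shows one way to do so; the function name is hypothetical and this is not the parsing code used by the PyTorch build.

```python
import re


def parse_rocm_arch(value):
    """Split a PYTORCH_ROCM_ARCH-style string into gfx targets (hypothetical helper)."""
    # Accept spaces and/or semicolons as separators and drop empty entries.
    return [arch for arch in re.split(r"[;\s]+", value.strip()) if arch]
```

Both example values from the commit message, `"gfx900 gfx906"` and `"gfx900;gfx906"`, yield the same two-element list.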
Pruthvi Madugundu	ab7a472980	[ROCm] Update HIP_VERSION to TORCH_HIP_VERSION (#62786 ) Summary: - HIP_VERSION semantic versioning will change in ROCm 4.3. The changes remove the dependency on the HIP_VERSION provided in the hip header, to keep the code compatible with both older and newer versions of ROCm. - TORCH_HIP_VERSION is derived from HIP_VERSION_MAJOR and HIP_VERSION_MINOR Pull Request resolved: https://github.com/pytorch/pytorch/pull/62786 Reviewed By: bdhirsh Differential Revision: D30281682 Pulled By: seemethere fbshipit-source-id: e41e69fb9e13de5ddd1af99ba5bbdcbb7b64b673	2021-08-13 15:00:43 -07:00
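The commit message says only that `TORCH_HIP_VERSION` is derived from the major and minor components; it does not give the formula. The sketch below assumes a `major * 100 + minor` encoding purely for illustration, and the function name is hypothetical.

```python
def torch_hip_version(hip_version_major, hip_version_minor):
    """Combine major and minor into one comparable integer (hypothetical helper)."""
    # Assumed encoding, not stated in the commit message: major * 100 + minor.
    return hip_version_major * 100 + hip_version_minor
```

An integer of this shape lets version checks use ordinary comparisons regardless of how the patch component of `HIP_VERSION` is encoded.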
Jeff Daily	24e27af683	[ROCm] enable kernel asserts (#49624 ) Summary: Addresses missing ROCm feature indicated in https://github.com/pytorch/pytorch/issues/38943. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49624 Reviewed By: agolynski Differential Revision: D28902459 Pulled By: malfet fbshipit-source-id: 29c9b552770241a0ec52cd057ea45efc4389d838	2021-06-07 13:43:07 -07:00
Jeff Daily	e1752ffa04	[reland][ROCm] use hiprtc precompiled header (#55965 ) Summary: Revert "Revert D27449031 (`2a7df657fe`): [pytorch][PR] [ROCm] use hiprtc precompiled header". Reland PR https://github.com/pytorch/pytorch/issues/54350. This reverts commit `204ac21bf1`. The original PR was reverted under suspicion that it was causing CI instability, but it was instead due to a hardware failure. Pull Request resolved: https://github.com/pytorch/pytorch/pull/55965 Reviewed By: jbschlosser Differential Revision: D27755907 Pulled By: malfet fbshipit-source-id: 75bf0b9d888df3dee62f00a366b1123757e0474e	2021-04-15 15:47:56 -07:00
Jeff Daily	a128938a75	[ROCm] add MAGMA_HOME env var hint to cmake, centos-rocm Dockerfile (#54511 ) Summary: MAGMA_HOME was previously set for the ubuntu-rocm/Dockerfile. However, this missed centos builds as well as any builds that do not use the CI image environments. Pull Request resolved: https://github.com/pytorch/pytorch/pull/54511 Reviewed By: jbschlosser Differential Revision: D27755983 Pulled By: malfet fbshipit-source-id: 1ffd2cd100f4221c2bb64e6915fa3372ee1f6247	2021-04-15 01:06:44 -07:00
Alexander Golynski	204ac21bf1	Revert D27449031: [pytorch][PR] [ROCm] use hiprtc precompiled header Test Plan: revert-hammer Differential Revision: D27449031 (`2a7df657fe`) Original commit changeset: 81a8d7847a47 fbshipit-source-id: b7b970c8ea4110357fba3ad4d52a86fa5641d90c	2021-04-01 06:42:04 -07:00
Jeff Daily	2a7df657fe	[ROCm] use hiprtc precompiled header (#54350 ) Summary: HIP's runtime compiler (hiprtc) is adding support for precompiled HIP headers in the ROCm 4.2 release. Conditionally add support for this feature. Using this feature will improve the ROCm torch wheel user experience; users will no longer need to install HIP headers separately to use torch JIT features. The use of this feature is conditionalized on a new ROCM_VERSION macro. Pull Request resolved: https://github.com/pytorch/pytorch/pull/54350 Reviewed By: H-Huang Differential Revision: D27449031 Pulled By: malfet fbshipit-source-id: 81a8d7847a47ce2bb253d1ea58740ef66ed154a3	2021-03-31 13:36:50 -07:00
Michael Melesse	2620bce42a	[ROCM] load only hipfft separately past rocm4.1 (#54349 ) Summary: This PR is a follow-up to https://github.com/pytorch/pytorch/pull/53408. It loads only hipfft if the version is ROCm 4.1 or later and stops loading rocfft. This was done to resolve some issues observed in our internal CI due to conflicts. Pull Request resolved: https://github.com/pytorch/pytorch/pull/54349 Reviewed By: ezyang Differential Revision: D27374252 Pulled By: ngimel fbshipit-source-id: 724e80df5011ea8fabd81739e18ae8a13d3a7ea0	2021-03-26 19:54:25 -07:00
Michael Melesse	4c1af249fb	[ROCM] load hipfft separately from rocfft (#53408 ) Summary: This PR changes how hipfft is loaded in PyTorch. As of ROCm 4.1, hipfft is packaged as a library separate from rocfft. We check the ROCm version and, if it is 4.1 or later, load hipfft in addition to rocfft. Pull Request resolved: https://github.com/pytorch/pytorch/pull/53408 Reviewed By: albanD Differential Revision: D26952702 Pulled By: malfet fbshipit-source-id: f42be304b587c060816e39d36f5c1a2cdc37bfab	2021-03-11 09:18:33 -08:00
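Taken together, the two hipfft commits above describe a version gate on which FFT library is loaded. The sketch below is a rough rendering of the end state after the follow-up (#54349); the function name and the tuple representation of the version are assumptions, and the real logic lives in CMake.

```python
def fft_libraries(rocm_version):
    """Pick FFT libraries to load for a (major, minor) ROCm version (hypothetical helper)."""
    # ROCm 4.1 or later: load hipfft only, per the follow-up commit.
    if rocm_version >= (4, 1):
        return ["hipfft"]
    # Assumption: earlier versions keep loading rocfft.
    return ["rocfft"]
```

Before the follow-up, the first commit (#53408) would have returned both libraries for the 4.1-or-later case.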


85 Commits