[Pytorch] Enable aarch64 convert autovec only on clang (#166739)

Summary: We've noted issues with modern GCC versions. Until further investigation is carried out, we'll leave the code enabled only on clang.
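
As an illustration of the guard described above, the following is a minimal sketch rather than the actual PyTorch code. The names `EXAMPLE_CONVERT_AUTOVEC` and `convert_sketch` are hypothetical, and the loop body is an assumed stand-in for the real `convertImpl`. It only shows how a clang-17+ check can gate a code path at compile time while other compilers fall back to a plain loop.

```cpp
#include <cassert>

// Hypothetical macro (not defined by PyTorch): 1 only when building with
// clang 17 or newer, mirroring the condition this commit switches to.
#if defined(__clang__) && (__clang_major__ >= 17)
#define EXAMPLE_CONVERT_AUTOVEC 1
#else
#define EXAMPLE_CONVERT_AUTOVEC 0
#endif

// Hypothetical stand-in for an element-wise conversion routine.
// Both branches compute the same result; the guard only decides whether
// the loop is explicitly offered to the compiler's auto-vectorizer.
template <typename from_type, typename to_type>
inline void convert_sketch(const from_type* src, to_type* dst, long n) {
  assert(n >= 0);
#if EXAMPLE_CONVERT_AUTOVEC
  // clang-only loop hint; never seen by other compilers because of the guard.
#pragma clang loop vectorize(enable)
  for (long i = 0; i < n; ++i) {
    dst[i] = static_cast<to_type>(src[i]);
  }
#else
  // Fallback for GCC and older clang: a plain scalar loop.
  for (long i = 0; i < n; ++i) {
    dst[i] = static_cast<to_type>(src[i]);
  }
#endif
}
```

A call such as `convert_sketch(src, dst, n)` with `float` input and `int` output behaves identically on every compiler; only the generated code may differ.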

Test Plan: CI

Differential Revision: D85968395

Pull Request resolved: https://github.com/pytorch/pytorch/pull/166739
Approved by: https://github.com/mcfi, https://github.com/Skylion007, https://github.com/robert-hardwick
Nicolas De Carli 2025-10-31 20:22:33 +00:00 committed by PyTorch MergeBot
parent 70aeb49198
commit 8209a0506b

@@ -6,9 +6,9 @@ namespace at::vec {
 inline namespace CPU_CAPABILITY {
 #if (defined(__aarch64__) && !defined(CPU_CAPABILITY_SVE256))
-// Enable auto-vectorization for GCC-13+ and clang-17+
+// Enable auto-vectorization for clang-17+
 // GCC-12 has a bug: gcc.gnu.org/bugzilla/show_bug.cgi?id=117001
-#if __GNUC__ > 12 || (defined(__clang__) && (__clang_major__ >= 17))
+#if defined(__clang__) && (__clang_major__ >= 17)
 template <typename from_type, typename to_type>
 inline void convertImpl(