Mirror of https://github.com/zebrajr/pytorch.git, synced 2025-12-07 12:21:27 +01:00
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18445

ghimport-source-id: 30d018737bf6989bc68b7e3676f44e0ca6141fde

Stack from [ghstack](https://github.com/ezyang/ghstack):

* #18242 Test running a CUDA build on a CPU machine.
* **#18445 Unify cudaGetDeviceCount implementations.**

I went about doing this by searching for calls to cudaGetDeviceCount and methodically replacing them with references to c10::cuda::device_count() or at::cuda::device_count(). There is a point to doing this: the various implementations differed wildly in how they handled an error return from cudaGetDeviceCount. The final standardized behavior is that **all errors are swallowed** and we return a device count of zero. This indirectly fixes running CUDA builds on CPU, which was broken in #17847.

I added 'noexcept' to the 'deviceCount' virtual method on DeviceGuardImpl. This is a BC-breaking change for anyone inheriting from DeviceGuardImpl, but all you need to do is put 'noexcept' on your method and it is backwards compatible with older libtorch.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D14612189

fbshipit-source-id: 3c8d186e3dd623c0e27625212c7ce30f75d943cb
Files:

- cuda_cmake_macros.h.in
- CUDAGuardImpl.cpp
- CUDAGuardImpl.h
- CUDATest.cpp
- CUDATest.h