pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Edward Yang 515238e0a5 Unify cudaGetDeviceCount implementations. (#18445)	2019-03-26 09:50:14 -07:00

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18445
ghimport-source-id: 30d018737bf6989bc68b7e3676f44e0ca6141fde

Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18242 Test running a CUDA build on CPU machine.
* #18445 Unify cudaGetDeviceCount implementations.

I went about doing this by searching for calls to cudaGetDeviceCount, and then methodically replacing them with references to c10::cuda::device_count() or at::cuda::device_count().

There is a point to doing this: the various implementations wildly differed in their handling of what to do when cudaGetDeviceCount returns an error. The final standardized behavior is that all errors are swallowed and we return device count of zero. This indirectly fixes running CUDA builds on CPU, which was broken in #17847.

I added 'noexcept' to the 'deviceCount' virtual method on DeviceGuardImpl. This is a BC-breaking change for anyone inheriting from DeviceGuardImpl but all you need to do is put 'noexcept' on your method and it is backwards compatible with older libtorch.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D14612189
fbshipit-source-id: 3c8d186e3dd623c0e27625212c7ce30f75d943cb
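The behavior described in the commit message (swallow every error from the device-count query and report zero devices, behind a `noexcept` virtual method) can be illustrated with a small sketch that does not need CUDA. Everything named here (`FakeError`, `QueryFn`, `device_count_or_zero`, `GuardImplInterface`, `FakeGuardImpl`, `two_devices`, `no_driver`) is hypothetical and only stands in for the real c10 types; it is not the code from the pull request.

```cpp
#include <cassert>

// Hypothetical stand-in for a driver query such as cudaGetDeviceCount:
// returns 0 on success and writes the count, nonzero on failure.
using FakeError = int;
using QueryFn = FakeError (*)(int*);

// Error-swallowing wrapper: any failure is reported as "zero devices".
inline int device_count_or_zero(QueryFn query) noexcept {
  int count = 0;
  const FakeError err = query(&count);
  if (err != 0) {
    return 0;  // ignore whatever the failed query wrote into count
  }
  return count;
}

// Hypothetical interface modelled on the change described above:
// the virtual method is declared noexcept.
struct GuardImplInterface {
  virtual ~GuardImplInterface() = default;
  virtual int deviceCount() const noexcept = 0;
};

// An override must repeat noexcept; omitting it is a compile error
// because the exception specification would be looser than the base's.
struct FakeGuardImpl final : GuardImplInterface {
  explicit FakeGuardImpl(QueryFn query) : query_(query) {}
  int deviceCount() const noexcept override {
    return device_count_or_zero(query_);
  }
  QueryFn query_;
};

// Two fake queries: one succeeds, one fails the way a machine
// without a usable driver might (the error code is arbitrary).
inline FakeError two_devices(int* n) { *n = 2; return 0; }
inline FakeError no_driver(int* n) { *n = 7; return 35; }
```

With this shape, a caller such as `FakeGuardImpl(no_driver).deviceCount()` sees zero devices instead of an error, which is the property that lets a CUDA build keep running on a CPU-only machine.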
cuda_cmake_macros.h.in	Add c10 cuda library. (#13900)	2018-11-19 08:20:07 -08:00
CUDAGuardImpl.cpp	Move CUDAGuard, CUDAStream and CUDAGuardImpl to c10/cuda (#14248)	2018-12-12 11:24:26 -08:00
CUDAGuardImpl.h	Unify cudaGetDeviceCount implementations. (#18445)	2019-03-26 09:50:14 -07:00
CUDATest.cpp	Catch cudaError_t return val (nodiscard in rocm) (#16399)	2019-02-11 13:18:36 -08:00
CUDATest.h	Add c10 cuda library. (#13900)	2018-11-19 08:20:07 -08:00