pytorch/c10/cuda
cyy b51f66c195 [Environment Variable][1/N] Use thread-safe env variable API in c10 (#119449)
This PR is the beginning of attempts to wrap thread-unsafe getenv and set_env functions inside a RW mutex.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119449
Approved by: https://github.com/albanD
2024-04-18 13:35:48 +00:00
impl Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
test Fix CUDA Bazel build to optionally include gmock after #104255 (#104308) 2023-06-29 07:15:06 +00:00
BUILD.bazel
build.bzl Support expandable_segments:True in fbcode for caching allocator 2023-05-02 11:12:39 -07:00
CMakeLists.txt [PyTorch CCA] Refactor caching allocator config code (#110123) 2023-10-04 14:58:23 +00:00
CUDAAlgorithm.h [c10] Use nested namespace in c10/cuda (#116464) 2023-12-27 23:14:00 +00:00
CUDAAllocatorConfig.cpp [Environment Variable][1/N] Use thread-safe env variable API in c10 (#119449) 2024-04-18 13:35:48 +00:00
CUDAAllocatorConfig.h [Environment Variable][1/N] Use thread-safe env variable API in c10 (#119449) 2024-04-18 13:35:48 +00:00
CUDACachingAllocator.cpp [Environment Variable][1/N] Use thread-safe env variable API in c10 (#119449) 2024-04-18 13:35:48 +00:00
CUDACachingAllocator.h [Clang-tidy header][25/N] Fix clang-tidy warnings and enable clang-tidy on c10/cuda/*.{cpp,h} (#121952) 2024-03-16 00:09:54 +00:00
CUDADeviceAssertion.h [c10] Use nested namespace in c10/cuda (#116464) 2023-12-27 23:14:00 +00:00
CUDADeviceAssertionHost.cpp [Environment Variable][1/N] Use thread-safe env variable API in c10 (#119449) 2024-04-18 13:35:48 +00:00
CUDADeviceAssertionHost.h [Clang-tidy header][15/N] Enable clang-tidy on headers in c10/cuda and c10/mobile (#116602) 2024-01-18 08:15:50 +00:00
CUDAException.cpp Enable nested namespace check in clang-tidy (#118506) 2024-01-31 00:32:35 +00:00
CUDAException.h [c10] Use nested namespace in c10/cuda (#116464) 2023-12-27 23:14:00 +00:00
CUDAFunctions.cpp Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
CUDAFunctions.h Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
CUDAGraphsC10Utils.h [Clang-tidy header][15/N] Enable clang-tidy on headers in c10/cuda and c10/mobile (#116602) 2024-01-18 08:15:50 +00:00
CUDAGuard.h [Clang-tidy header][24/N] Fix clang-tidy warnings on c10/cuda/*.{cpp,h} (#120781) 2024-03-15 05:03:22 +00:00
CUDAMacros.h Temporarily increased compile time limit of #GPUs to 120. (#121076) 2024-03-05 11:39:14 +00:00
CUDAMallocAsyncAllocator.cpp [Clang-tidy header][25/N] Fix clang-tidy warnings and enable clang-tidy on c10/cuda/*.{cpp,h} (#121952) 2024-03-16 00:09:54 +00:00
CUDAMathCompat.h [c10] Use nested namespace in c10/cuda (#116464) 2023-12-27 23:14:00 +00:00
CUDAMiscFunctions.cpp [Environment Variable][1/N] Use thread-safe env variable API in c10 (#119449) 2024-04-18 13:35:48 +00:00
CUDAMiscFunctions.h [c10] Use nested namespace in c10/cuda (#116464) 2023-12-27 23:14:00 +00:00
CUDAStream.cpp Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
CUDAStream.h [Clang-tidy header][24/N] Fix clang-tidy warnings on c10/cuda/*.{cpp,h} (#120781) 2024-03-15 05:03:22 +00:00
driver_api.cpp Remove unused header inclusion (#119667) 2024-02-12 05:36:25 +00:00
driver_api.h [c10] Use nested namespace in c10/cuda (#116464) 2023-12-27 23:14:00 +00:00
README.md

c10/cuda is a core library with CUDA functionality. It is distinguished from c10 in that it links against the CUDA library, but like c10 it doesn't contain any kernels, and consists solely of core functionality that is generally useful when writing CUDA code; for example, C++ wrappers for the CUDA C API.

Important notes for developers. If you want to add files or functionality to this folder, TAKE NOTE. The code in this folder is very special, because on our AMD GPU build, we transpile it into c10/hip to provide a ROCm environment. Thus, if you write:

// c10/cuda/CUDAFoo.h
namespace c10 { namespace cuda {

void my_func();

}}

this will get transpiled into:

// c10/hip/HIPFoo.h
namespace c10 { namespace hip {

void my_func();

}}

Thus, if you add new functionality to c10, you must also update C10_MAPPINGS in torch/utils/hipify/cuda_to_hip_mappings.py to transpile occurrences of cuda::my_func to hip::my_func. (At the moment, we do NOT have a catch-all cuda:: to hip:: namespace conversion, as not all cuda namespaces are converted to hip::, even though c10's are.)
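
As a rough illustration of what such a mapping does, here is a minimal, self-contained sketch. It is not the real hipify code: the name C10_MAPPINGS_SKETCH, the placeholder tag API_C10, the entry for CUDAFoo.h, and the apply_mappings helper are all hypothetical, and the source -> (replacement, tag) entry shape is an assumption, so check the existing entries in cuda_to_hip_mappings.py before copying it.

```python
from collections import OrderedDict

# Placeholder tag for illustration only; the real mappings file defines
# its own constants, which may differ from this string.
API_C10 = "API_C10"

# Hypothetical entries, assumed to be shaped as: CUDA text -> (HIP text, tag).
C10_MAPPINGS_SKETCH = OrderedDict([
    ("c10/cuda/CUDAFoo.h", ("c10/hip/HIPFoo.h", API_C10)),
    ("cuda::my_func", ("hip::my_func", API_C10)),
])


def apply_mappings(source, mappings):
    """Naively replace each mapped CUDA name with its HIP counterpart."""
    for cuda_name, (hip_name, _tag) in mappings.items():
        source = source.replace(cuda_name, hip_name)
    return source


snippet = (
    "#include <c10/cuda/CUDAFoo.h>\n"
    "void f() { c10::cuda::my_func(); other::cuda::thing(); }\n"
)
converted = apply_mappings(snippet, C10_MAPPINGS_SKETCH)
print(converted)
```

In this sketch only the explicitly listed names are rewritten, so other::cuda::thing() is left alone. That is the reason each new function needs its own entry when there is no catch-all cuda:: to hip:: rule.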

Transpilation inside this folder is controlled by CAFFE2_SPECIFIC_MAPPINGS (oddly enough). C10_MAPPINGS apply to ALL source files.

If you add a new directory to this folder, you MUST update both c10/cuda/CMakeLists.txt and c10/hip/CMakeLists.txt.