pytorch/torch/csrc/cuda
Elias Ellison 0a9778a372 Expose cudaStreamCaptureMode in CUDA Graphs, use local setting in inductor (#107407)
> capture_error_mode (str, optional): specifies the cudaStreamCaptureMode for the graph capture stream.
> Can be "global", "thread_local" or "relaxed". During CUDA graph capture, some actions, such as cudaMalloc,
> may be unsafe. "global" will error on actions in other threads, "thread_local" will only error for
> actions in the current thread, and "relaxed" will not error on these actions.

Inductor codegen is single-threaded, so it should be safe to enable "thread_local" for inductor's CUDA graph capture. We have seen errors when inductor cudagraphs were used concurrently with data preprocessing in other threads.
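
As a rough illustration of the three modes described above, the sketch below validates a mode string and then captures a small graph with it. It assumes a PyTorch build that includes this change, where `torch.cuda.graph` accepts a `capture_error_mode` keyword. The names `CAPTURE_MODES`, `check_capture_error_mode`, and `capture_square` are hypothetical helpers, and the string-to-enum table is an assumption about how the strings correspond to the CUDA runtime's `cudaStreamCaptureMode` values, not a copy of the code in Graph.cpp.

```python
# Hypothetical helper table: assumed correspondence between the Python-level
# strings and the CUDA runtime's cudaStreamCaptureMode enumerators.
CAPTURE_MODES = {
    "global": "cudaStreamCaptureModeGlobal",
    "thread_local": "cudaStreamCaptureModeThreadLocal",
    "relaxed": "cudaStreamCaptureModeRelaxed",
}


def check_capture_error_mode(mode):
    """Return the assumed enum name for `mode`, or raise on an unknown string."""
    if mode not in CAPTURE_MODES:
        raise ValueError(
            f"expected one of {sorted(CAPTURE_MODES)}, got {mode!r}"
        )
    return CAPTURE_MODES[mode]


def capture_square(capture_error_mode="thread_local"):
    """Capture and replay `x * x` with the given capture error mode.

    Returns None when no CUDA device is available.
    """
    check_capture_error_mode(capture_error_mode)

    import torch  # deferred so the helpers above work without PyTorch

    if not torch.cuda.is_available():
        return None

    static_in = torch.zeros(8, device="cuda")

    # Warm up on a side stream before capturing.
    side = torch.cuda.Stream()
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side):
        static_in * static_in
    torch.cuda.current_stream().wait_stream(side)

    graph = torch.cuda.CUDAGraph()
    # With "thread_local", unsafe calls made by other threads (for example a
    # data-preprocessing thread) should not invalidate this capture.
    with torch.cuda.graph(graph, capture_error_mode=capture_error_mode):
        static_out = static_in * static_in

    static_in.fill_(3.0)
    graph.replay()  # recomputes static_out from the updated static_in
    return static_out
```

Calling `capture_square("thread_local")` mirrors the setting this commit enables for inductor. The default for user code remains "global", which is the strictest of the three modes.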

Differential Revision: [D48656014](https://our.internmc.facebook.com/intern/diff/D48656014)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107407
Approved by: https://github.com/albanD, https://github.com/eqy
2023-08-25 01:44:26 +00:00
shared Revert "[Reland] Upgrade NVTX to NVTX3 (#97582)" 2023-08-15 20:55:12 +00:00
comm.cpp [BE] Use nested namespaces in torch/csrc/cuda (#106928) 2023-08-10 03:56:09 +00:00
comm.h [BE] Use nested namespaces in torch/csrc/cuda (#106928) 2023-08-10 03:56:09 +00:00
CUDAPluggableAllocator.cpp Introduce memory stacks for free (#106758) 2023-08-14 20:38:15 +00:00
CUDAPluggableAllocator.h Introduce memory stacks for free (#106758) 2023-08-14 20:38:15 +00:00
device_set.h
Event.cpp
Event.h
Graph.cpp Expose cudaStreamCaptureMode in CUDA Graphs, use local setting in inductor (#107407) 2023-08-25 01:44:26 +00:00
memory_snapshot.cpp [memory snapshot] add 'address' key to block (#107171) 2023-08-23 18:57:24 +00:00
memory_snapshot.h Introduce memory stacks for free (#106758) 2023-08-14 20:38:15 +00:00
Module.cpp [1/N] fix clang-tidy warnings in torch/csrc (#107648) 2023-08-25 00:30:09 +00:00
Module.h
nccl.cpp [BE] Use nested namespaces in torch/csrc/cuda (#106928) 2023-08-10 03:56:09 +00:00
nccl.h [BE] Use nested namespaces in torch/csrc/cuda (#106928) 2023-08-10 03:56:09 +00:00
python_comm.cpp [BE] Use nested namespaces in torch/csrc/cuda (#106928) 2023-08-10 03:56:09 +00:00
python_comm.h [BE] Use nested namespaces in torch/csrc/cuda (#106928) 2023-08-10 03:56:09 +00:00
python_nccl.cpp
python_nccl.h
Stream.cpp add additional stream priority for cuda streams (#101956) 2023-05-27 02:36:16 +00:00
Stream.h
Tensor.cpp
THCP.h
utils.cpp