pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Elias Ellison 0a9778a372 Expose cudaStreamCaptureMode in CUDA Graphs, use local setting in inductor (#107407)	2023-08-25 01:44:26 +00:00

> capture_error_mode (str, optional): specifies the cudaStreamCaptureMode for the graph capture stream. Can be "global", "thread_local" or "relaxed". During cuda graph capture, some actions, such as cudaMalloc, may be unsafe. "global" will error on actions in other threads, "thread_local" will only error for actions in the current thread, and "relaxed" will not error on these actions.

Inductor codegen is single-threaded, so it should be safe to enable "thread_local" for inductor's cuda graph capturing. We have seen errors when inductor cudagraphs has been used concurrently with data preprocessing in other threads.

Differential Revision: [D48656014](https://our.internmc.facebook.com/intern/diff/D48656014)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107407
Approved by: https://github.com/albanD, https://github.com/eqy
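
The `capture_error_mode` option described in this commit message can be exercised from Python roughly as follows. This is a minimal sketch, not code from the commit: it assumes a PyTorch build that includes this change, so that `torch.cuda.graph` accepts a `capture_error_mode` keyword. The helper names `check_capture_error_mode` and `capture_doubling` are hypothetical and exist only for illustration.

```python
# The three strings named in the commit message.
VALID_CAPTURE_ERROR_MODES = ("global", "thread_local", "relaxed")


def check_capture_error_mode(mode: str) -> str:
    """Return `mode` unchanged if it is one of the three documented strings."""
    if mode not in VALID_CAPTURE_ERROR_MODES:
        raise ValueError(
            f"expected one of {VALID_CAPTURE_ERROR_MODES}, got {mode!r}"
        )
    return mode


def capture_doubling(mode: str = "thread_local"):
    """Capture y = x * 2 in a CUDA graph; return None when CUDA is unavailable."""
    check_capture_error_mode(mode)
    try:
        import torch  # imported lazily so the validation helper works without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    x = torch.ones(4, device="cuda")

    # Warm up on a side stream before capture, as the PyTorch CUDA graphs notes suggest.
    side = torch.cuda.Stream()
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side):
        _ = x * 2
    torch.cuda.current_stream().wait_stream(side)

    graph = torch.cuda.CUDAGraph()
    # Assumption: this PyTorch build has the capture_error_mode keyword from #107407.
    with torch.cuda.graph(graph, capture_error_mode=mode):
        y = x * 2
    graph.replay()
    torch.cuda.synchronize()
    return y.cpu().tolist()


if __name__ == "__main__":
    print(capture_doubling("thread_local"))
```

With "thread_local", unsafe calls made by other threads (for example a data-loading thread that allocates memory) should not invalidate the capture, which is the situation the commit message describes for inductor.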
shared	Revert "[Reland] Upgrade NVTX to NVTX3 (#97582)"	2023-08-15 20:55:12 +00:00
comm.cpp	[BE] Use nested namespaces in `torch/csrc/cuda` (#106928)	2023-08-10 03:56:09 +00:00
comm.h	[BE] Use nested namespaces in `torch/csrc/cuda` (#106928)	2023-08-10 03:56:09 +00:00
CUDAPluggableAllocator.cpp	Introduce memory stacks for free (#106758)	2023-08-14 20:38:15 +00:00
CUDAPluggableAllocator.h	Introduce memory stacks for free (#106758)	2023-08-14 20:38:15 +00:00
device_set.h
Event.cpp
Event.h
Graph.cpp	Expose cudaStreamCaptureMode in CUDA Graphs, use local setting in inductor (#107407)	2023-08-25 01:44:26 +00:00
memory_snapshot.cpp	[memory snapshot] add 'address' key to block (#107171)	2023-08-23 18:57:24 +00:00
memory_snapshot.h	Introduce memory stacks for free (#106758)	2023-08-14 20:38:15 +00:00
Module.cpp	[1/N] fix clang-tidy warnings in torch/csrc (#107648)	2023-08-25 00:30:09 +00:00
Module.h
nccl.cpp	[BE] Use nested namespaces in `torch/csrc/cuda` (#106928)	2023-08-10 03:56:09 +00:00
nccl.h	[BE] Use nested namespaces in `torch/csrc/cuda` (#106928)	2023-08-10 03:56:09 +00:00
python_comm.cpp	[BE] Use nested namespaces in `torch/csrc/cuda` (#106928)	2023-08-10 03:56:09 +00:00
python_comm.h	[BE] Use nested namespaces in `torch/csrc/cuda` (#106928)	2023-08-10 03:56:09 +00:00
python_nccl.cpp
python_nccl.h
Stream.cpp	add additional stream priority for cuda streams (#101956)	2023-05-27 02:36:16 +00:00
Stream.h
Tensor.cpp
THCP.h
utils.cpp