pytorch/torch
Elias Ellison 0a9778a372 Expose cudaStreamCaptureMode in CUDA Graphs, use local setting in inductor (#107407)
> capture_error_mode (str, optional): specifies the cudaStreamCaptureMode for the graph capture stream.
> Can be "global", "thread_local" or "relaxed". During CUDA graph capture, some actions, such as cudaMalloc,
> may be unsafe. "global" will error on actions in other threads, "thread_local" will only error for
> actions in the current thread, and "relaxed" will not error on these actions.

Inductor codegen is single-threaded, so it should be safe to enable "thread_local" for inductor's CUDA graph capture. We have seen errors when inductor cudagraphs is used concurrently with data preprocessing in other threads. (A sketch of the underlying CUDA capture modes follows the commit metadata below.)

Differential Revision: [D48656014](https://our.internmc.facebook.com/intern/diff/D48656014)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107407
Approved by: https://github.com/albanD, https://github.com/eqy
2023-08-25 01:44:26 +00:00
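For context, a minimal sketch of the raw CUDA runtime calls that the docstring's strings map onto ("global" -> cudaStreamCaptureModeGlobal, "thread_local" -> cudaStreamCaptureModeThreadLocal, "relaxed" -> cudaStreamCaptureModeRelaxed). This is plain CUDA runtime API, not PyTorch's actual capture code; error handling is omitted.

```cpp
// Sketch: capturing a stream into a CUDA graph with an explicit
// cudaStreamCaptureMode. Plain CUDA runtime API, error checks omitted.
#include <cuda_runtime.h>

__global__ void scale(float* x, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) x[i] *= 2.0f;
}

cudaGraph_t capture(cudaStream_t stream, float* buf, int n) {
  cudaGraph_t graph = nullptr;
  // "thread_local": only potentially unsafe actions (e.g. cudaMalloc)
  // issued from *this* thread invalidate the capture; other threads may
  // keep allocating. "global" would flag those too; "relaxed" flags neither.
  cudaStreamBeginCapture(stream, cudaStreamCaptureModeThreadLocal);
  scale<<<(n + 255) / 256, 256, 0, stream>>>(buf, n);
  cudaStreamEndCapture(stream, &graph);
  return graph;  // instantiate via cudaGraphInstantiate, run via cudaGraphLaunch
}
```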
_awaits
_C Expose cudaStreamCaptureMode in CUDA Graphs, use local setting in inductor (#107407) 2023-08-25 01:44:26 +00:00
_C_flatbuffer
_custom_op Extend impl_backward to be usable with torch.library operators (#106817) 2023-08-14 14:33:46 +00:00
_decomp Fix Inplace tensor update on transpose (#104689) 2023-08-24 16:58:50 +00:00
_dispatch Fix some fake mode confusion between inner/outer fake mode in export (#106515) 2023-08-04 15:42:23 +00:00
_dynamo Improve unbacked symint error msg (#107806) 2023-08-25 01:07:09 +00:00
_export Improve unbacked symint error msg (#107806) 2023-08-25 01:07:09 +00:00
_functorch Fix aot sequence_nr to reset bwd flag (#107210) 2023-08-24 16:58:12 +00:00
_higher_order_ops [pytorch][Quant] Fix bias quant bug (#107810) 2023-08-24 23:44:19 +00:00
_inductor Expose cudaStreamCaptureMode in CUDA Graphs, use local setting in inductor (#107407) 2023-08-25 01:44:26 +00:00
_lazy
_logging Add frame/recompile counter to all log messages in tracing context (#107530) 2023-08-21 13:02:12 +00:00
_numpy torch._numpy: keep f16 CUDA tensors in f16 where possible (#107768) 2023-08-23 18:35:47 +00:00
_prims [BE]: Update ruff to 0.285 (#107519) 2023-08-22 23:16:38 +00:00
_prims_common Remove dynamo+nvfuser (#105789) 2023-08-08 22:29:32 +00:00
_refs Fix Inplace tensor update on transpose (#104689) 2023-08-24 16:58:50 +00:00
_subclasses Fix Inplace tensor update on transpose (#104689) 2023-08-24 16:58:50 +00:00
amp Apply UFMT to low traffic torch modules (#106249) 2023-07-29 23:37:30 +00:00
ao [pytorch][Quant] Fix bias quant bug (#107810) 2023-08-24 23:44:19 +00:00
autograd [profiler] move _enable_dynamo_cache_lookup_profiler (#107720) 2023-08-23 23:41:35 +00:00
backends [BE]: Update ruff to 0.285 (#107519) 2023-08-22 23:16:38 +00:00
compiler
contrib [BE] Enable ruff's UP rules and autoformat nn/ mps/ and torch/ (#105436) 2023-07-21 07:38:46 +00:00
cpu Apply UFMT to low traffic torch modules (#106249) 2023-07-29 23:37:30 +00:00
csrc Expose cudaStreamCaptureMode in CUDA Graphs, use local setting in inductor (#107407) 2023-08-25 01:44:26 +00:00
cuda Expose cudaStreamCaptureMode in CUDA Graphs, use local setting in inductor (#107407) 2023-08-25 01:44:26 +00:00
distributed [FSDP] verify backward_prefetch works correctly with unit test (#107058) 2023-08-25 01:12:43 +00:00
distributions [BE]: Update ruff to 0.285 (#107519) 2023-08-22 23:16:38 +00:00
export Expose ExportedProgram and related classes (#107852) 2023-08-25 00:07:00 +00:00
fft
func [pt2] support vmap (#101707) 2023-08-09 03:39:33 +00:00
futures
fx Fix aot sequence_nr to reset bwd flag (#107210) 2023-08-24 16:58:12 +00:00
jit [BE]: Update ruff to 0.285 (#107519) 2023-08-22 23:16:38 +00:00
legacy
lib Revert "Remove some unnecessary <iostream> includes from headers (#106914)" 2023-08-22 17:16:48 +00:00
linalg [CUDA][Linalg] Patch crash of linalg.eigh when input matrix is ill-conditioned, in some cusolver version (#107082) 2023-08-16 21:15:15 +00:00
masked [BE]: Update Ruff to 0.0.280 (#105724) 2023-07-22 23:03:34 +00:00
monitor
mps [MPS] Introduce torch.mps.Event() APIs (#102121) 2023-08-08 03:45:45 +00:00
multiprocessing Apply UFMT to low traffic torch modules (#106249) 2023-07-29 23:37:30 +00:00
nested
nn Fix the document of torch.nn.functional.conv2d (#107851) 2023-08-24 18:02:03 +00:00
onnx [ONNX] Cap opset version at 17 for torch.onnx.export (#107829) 2023-08-24 07:21:10 +00:00
optim [optim] Make casting to match params a hook (#106725) 2023-08-23 22:25:33 +00:00
package [BE]: Update Ruff to 0.0.280 (#105724) 2023-07-22 23:03:34 +00:00
profiler [profiler] move _enable_dynamo_cache_lookup_profiler (#107720) 2023-08-23 23:41:35 +00:00
quantization Apply UFMT to low traffic torch modules (#106249) 2023-07-29 23:37:30 +00:00
signal [BE] Enable ruff's UP rules and autoformat optim/ (#105426) 2023-07-18 21:07:43 +00:00
sparse Revert "[core][pruning][feature] cuSPARSELt kernels and ops (#102133)" 2023-08-09 16:03:14 +00:00
special
testing Enhance fakepg: send and recv (#107625) 2023-08-24 22:06:34 +00:00
utils Expose cudaStreamCaptureMode in CUDA Graphs, use local setting in inductor (#107407) 2023-08-25 01:44:26 +00:00
__config__.py
__future__.py
__init__.py [BE]: Update ruff to 0.285 (#107519) 2023-08-22 23:16:38 +00:00
_appdirs.py [BE] f-stringify torch/ and scripts (#105538) 2023-07-21 19:35:24 +00:00
_classes.py
_compile.py
_custom_ops.py Extend impl_backward to be usable with torch.library operators (#106817) 2023-08-14 14:33:46 +00:00
_deploy.py
_guards.py [dynamo] Store originating source in the Guard object (#107634) 2023-08-22 02:16:31 +00:00
_jit_internal.py
_linalg_utils.py [BE] Enable ruff's UP rules and autoformat nn/ mps/ and torch/ (#105436) 2023-07-21 07:38:46 +00:00
_lobpcg.py [BE]: Update ruff to 0.285 (#107519) 2023-08-22 23:16:38 +00:00
_lowrank.py [BE] f-stringify torch/ and scripts (#105538) 2023-07-21 19:35:24 +00:00
_meta_registrations.py [CPU] Enable fused_attention pattern matcher (#107128) 2023-08-20 08:53:24 +00:00
_namedtensor_internals.py [BE]: Update ruff to 0.285 (#107519) 2023-08-22 23:16:38 +00:00
_ops.py [BE]: Update ruff to 0.285 (#107519) 2023-08-22 23:16:38 +00:00
_python_dispatcher.py [BE] Enable ruff's UP rules and autoformat nn/ mps/ and torch/ (#105436) 2023-07-21 07:38:46 +00:00
_sources.py
_storage_docs.py
_tensor_docs.py Modify signature for tensor.tile in doc (#106295) 2023-08-01 19:51:52 +00:00
_tensor_str.py [BE] f-stringify torch/ and scripts (#105538) 2023-07-21 19:35:24 +00:00
_tensor.py [PyTorch][Tensor] Introduce tensor.dim_order (#106835) 2023-08-25 00:06:03 +00:00
_torch_docs.py [PyTorch][Tensor] Introduce tensor.dim_order (#106835) 2023-08-25 00:06:03 +00:00
_utils_internal.py [Dynamo] Improve PT2 fbcode logging observability (#106932) 2023-08-11 20:46:04 +00:00
_utils.py Back out "Reland "Make adding buffers more like adding parameters (#104069)" (#106224)" (#106743) 2023-08-08 15:27:34 +00:00
_VF.py
_vmap_internals.py
_weights_only_unpickler.py
abi-check.cpp
CMakeLists.txt Revert "[Reland] Upgrade NVTX to NVTX3 (#97582)" 2023-08-15 20:55:12 +00:00
custom_class_detail.h
custom_class.h Revert "Remove some unnecessary <iostream> includes from headers (#106914)" 2023-08-22 17:16:48 +00:00
extension.h reduce header file to boost cpp_wrapper build. (#107585) 2023-08-22 11:58:47 +00:00
functional.py fix torch.norm for custom device (#106198) 2023-08-02 06:25:52 +00:00
hub.py Default permissions for torch.hub downloads (#82869) 2023-08-24 15:48:24 +00:00
library.h
library.py Enable registering fallthroughs to (op, dk) from torch.library (#106086) 2023-07-28 19:37:59 +00:00
overrides.py [PyTorch][Tensor] Introduce tensor.dim_order (#106835) 2023-08-25 00:06:03 +00:00
py.typed
quasirandom.py [BE]: Update ruff to 0.285 (#107519) 2023-08-22 23:16:38 +00:00
random.py
README.txt
return_types.py
script.h
serialization.py [BE]: Update ruff to 0.285 (#107519) 2023-08-22 23:16:38 +00:00
storage.py [BE] Enable ruff's UP rules and autoformat nn/ mps/ and torch/ (#105436) 2023-07-21 07:38:46 +00:00
torch_version.py [BE] Enable ruff's UP rules and autoformat nn/ mps/ and torch/ (#105436) 2023-07-21 07:38:46 +00:00
types.py [BE]: Apply PYI autofixes to various types (#107521) 2023-08-20 02:42:21 +00:00
version.py.tpl

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers do double duty: they are installed like the
public headers, but they are *internal implementation detail* headers
whose contents should largely not be used by external clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.
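
To make the convention concrete, here is a hypothetical sketch of the two
styles.  The function and member names are illustrative of TH's classic C
API, not exact signatures from any particular release.

```cpp
// Hypothetical sketch of Note [TH abstraction violation]; names are
// illustrative of TH's classic C API rather than exact signatures.
#include <cstdio>
#include <TH/THTensor.h>  // public C API: what external code should use

void describe(THFloatTensor* t) {
  // Good: manipulate the tensor only through public functions.
  int ndim = THFloatTensor_nDimension(t);
  long n0 = THFloatTensor_size(t, 0);
  printf("%d dims, size(0)=%ld\n", ndim, n0);
}

// Bad -- the violation this note marks in torch/csrc:
//
//   #include <TH/THTensor.hpp>     // internal implementation detail
//   auto* storage = t->storage_;   // reaches into THTensor's guts
//
// Code like this must be updated whenever THTensor's internals change,
// which is why each such site carries a pointer to this note.
```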