pytorch/docs/source
Frank Lin 0c0e056a9e [CUDA] Reuse blocks with record_stream during CUDA Graph capture in the CUDACachingAllocator (#158352)
## Introduction

During CUDA Graph capture, the CUDA caching allocator currently defers reclaiming blocks until capture ends. This is because CUDA forbids querying events recorded during capture (the CUDA operation is not executed during the capture stage), so the allocator cannot use its normal event-based logic. However, capture records an DAG (we call it **capturing graph**) of work. We can use the capturing graph to determine when a block’s old lifetime is fully before future work, and safely reuse it within the same capture.

This PR adds an experimental flag `graph_capture_record_stream_reuse: True|False (default: False)`. When enabled, the allocator inserts lightweight free markers and uses capture ordering to decide if a freed block is safe to reuse during capture. If the proof cannot be established, we fall back to the existing post-capture path.

## Terms

* **Free marker**: A capture-legal no-op (created with `cudaGraphAddEmptyNode`) inserted after the last captured use of the block on each stream that used it.
* **Terminal**: The set of the lastest operations of the stream (or the capturing graph). Any newly captured op on that stream will attach after all nodes in this set. For a stream currently capturing, it is the set of nodes returned in `dependencies_out` by `cudaStreamGetCaptureInfo`.

## When can we reuse a block during capture?

### Strong Rule (Graph-Wide Safety)

This rule provides a universal guarantee that a block is safe for reuse by any stream in the graph.

> A block is safe to reuse if every free marker is a predecessor of every terminal of all active streams in the graph.

Why it's safe:

This rule establishes a strict global ordering. Since any new operation on any stream must be appended after that stream's terminals, this condition guarantees that the block's new lifetime begins only after its old lifetime has completely ended everywhere. This prevents lifetime overlaps when the graph is replayed, ensuring correctness.

### Per-stream Rule (A Practical Optimization)

The strong rule, while safe, is often unnecessarily restrictive. The `DeviceCachingAllocator` introduces a crucial constraint that allows for a simpler check.

In `DeviceCachingAllocator`, `get_free_block` only returns blocks whose `block->stream == p.stream()`. In other words, we never reuse a block on a stream different from the allocation stream. This means we don't need to verify safety across the entire graph. We only need to confirm that the block is safe to reuse from the perspective of its own allocation stream.

> Reuse a block for allocations on stream S if every free marker is a predecessor of every node in the terminal set of S.

In short, a block is considered **reusable** on stream S as long as all marker marking it "free" are guaranteed to complete before any new work that might need it on stream S begins.

## Implementation

* On `free(block)` during capture
  * For each stream in `block->stream_uses` and the allocation stream, insert a free marker (empty node) and make it that stream’s tail.
  * If we cannot place markers for all such streams (for example, a stream is not in capture), defer to the post-capture path.
  * Otherwise, store the marker handles and keep the block in the capture-private structures.
* On `allocate(stream)` during capture (attempt per-stream reclaim)
  * Query the allocation stream S’s terminal via `cudaStreamGetCaptureInfo`.
  * For each deferred block, check whether it is allocated on this stream, and each of its free markers is a predecessor of the terminal.
    * If yes, hand the block to S for immediate reuse within the same capture.
    * If no, keep it deferred; it will be reconsidered as capture progresses and S’s terminal advances.
* On capture end
  * Any still-deferred blocks follow the existing post-capture reclamation (event insertion/polling). External behavior remains unchanged if we cannot prove safety during capture.

## Examples (2 streams)

<img width="641" height="801" alt="pytorch-remove-cudagraph-defer-reclaiming (6)" src="https://github.com/user-attachments/assets/41adc835-d448-483b-99ba-b4341cb7d2a2" />

* Case 0 — Unsafe
The two frees are not ordered with respect to each other. For stream 1, the other stream’s free marker does not precede this stream’s terminal, so the per-stream condition fails.
Counterexample intuition for the unsafe setups: imagine `f2(x)` runs for a long time. If DeviceCachingAllocator reused block `x` on a stream whose terminal is not ordered after the free markers, the new lifetime could overlap the old one on replay, risking use-after-free or data corruption. The per-stream rule prevents exactly this.
* Case 1 — Reusable on stream 1
Stream 1’s terminal is after both frees, so every free marker precedes stream 1’s terminal. The block is reusable for allocations on stream 1.
* Case 2 — Not reusable on stream 2, but this cannot occur in `DeviceCachingAllocator`
This depicts reusing the block on stream 2 while stream 1’s free is not yet ordered before stream 2’s terminal. Though the block is not safe to reuse on stream 2, DeviceCachingAllocator will not choose that block for stream 2 anyway: `get_free_block` rejects blocks whose `stream != p.stream()`. So this case is unreachable.
* Case 3 — Safe (strong rule holds)
In this scenario, the terminal nodes of all streams are positioned after the block's free markers, satisfying the strong rule. This guarantees the block is safe for reuse by any stream in the capturing graph. However, since `DeviceCachingAllocator ` only reuses a block on its original allocation stream, verifying this strong condition is unnecessary. We only need to ensure the per-stream rule is met for the specific stream requesting the block.
* Case 4 — Freeing after a join
See the note below.

## Edge Case: Freeing after a join

Our current dependency tracking has a limitation in scenarios where a block is freed after a stream join, see @galv's [comments here](https://github.com/pytorch/pytorch/pull/158352#pullrequestreview-3112565198)).

In the case 4, we have a missed opportunity. Because the block's usage is not explicitly marked, we cannot determine that the block's actual last use may have occurred much earlier, long before the join. Then, we must wait for the subsequent join before the block can be reused.

## Thanks
Thanks to @galv for his great idea around graph parsing and empty nodes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158352
Approved by: https://github.com/ngimel, https://github.com/eqy

Co-authored-by: Jeff Daily <jeff.daily@amd.com>
2025-09-04 17:21:26 +00:00
..
_static [doc] AOTI debugging guide (#160430) 2025-08-14 23:42:17 +00:00
_templates Migrate to new theme (#149331) 2025-04-16 21:35:19 +00:00
accelerator [OpenReg] Migrate Accelerator Document from source/notes into source/accelerator (#161845) 2025-09-03 03:12:18 +00:00
community [doc] add weifengpy to torch distributed pocs (#158989) 2025-07-24 04:42:33 +00:00
compile [dynamo] change error_on_graph_break/fullgraph semantics (#161747) 2025-09-04 17:10:17 +00:00
elastic Support NUMA Binding for Callable Entrypoints (#160163) 2025-08-12 20:08:49 +00:00
export [export] Add runnable code to export docs (#158506) 2025-07-17 20:15:22 +00:00
notes [CUDA] Reuse blocks with record_stream during CUDA Graph capture in the CUDACachingAllocator (#158352) 2025-09-04 17:21:26 +00:00
rpc Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
scripts [ONNX] Filter out torchscript sentences (#158850) 2025-07-24 20:59:06 +00:00
user_guide [OpenReg] Migrate Accelerator Document from source/notes into source/accelerator (#161845) 2025-09-03 03:12:18 +00:00
accelerator.md Add unified memory APIs for torch.accelerator (#152932) 2025-08-08 17:41:22 +00:00
amp.md [Docs] Convert to markdown: accelerator.rst, amp.rst, autograd.rst, backends.rst, benchmark_utils.rst (#155762) 2025-06-12 02:55:06 +00:00
autograd.md [Docs] Convert to markdown: accelerator.rst, amp.rst, autograd.rst, backends.rst, benchmark_utils.rst (#155762) 2025-06-12 02:55:06 +00:00
backends.md Revert "[ROCm] SDPA fix mem fault when dropout is enabled (#154864)" 2025-08-26 20:03:59 +00:00
benchmark_utils.md [Docs] Convert to markdown: accelerator.rst, amp.rst, autograd.rst, backends.rst, benchmark_utils.rst (#155762) 2025-06-12 02:55:06 +00:00
bottleneck.rst
checkpoint.md Convert to markdown: checkpoint.rst (#156009) 2025-06-16 17:48:23 +00:00
complex_numbers.md Convert complex_numbers.rst to markdown (#156039) 2025-06-16 17:24:37 +00:00
cond.md [Docs] Convert to markdown cond.rst, config_mod.rst (#155653) 2025-06-13 20:58:57 +00:00
conf.py Revert "Always build USE_DISTRIBUTED. (#160449)" 2025-09-04 15:25:07 +00:00
config_mod.md [Docs] Convert to markdown cond.rst, config_mod.rst (#155653) 2025-06-13 20:58:57 +00:00
cpp_extension.rst
cpp_index.rst [3/n] Remove references to TorchScript in PyTorch docs (#158315) 2025-07-15 21:14:18 +00:00
cpu.rst
cuda_environment_variables.rst
cuda._sanitizer.rst
cuda.md Revert "[WIP] Merge Test (#160998)" 2025-08-19 20:30:39 +00:00
cuda.tunable.md Fix #155016 for Docathon - convert rst to markdown (#155198) 2025-06-13 20:24:34 +00:00
cudnn_persistent_rnn.rst
cudnn_rnn_determinism.rst Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
data.md Fix #155016 for Docathon - convert rst to markdown (#155198) 2025-06-13 20:24:34 +00:00
ddp_comm_hooks.md DOC: Convert to markdown: ddp_comm_hooks.rst, debugging_environment_variables.rst, deploy.rst, deterministic.rst, distributed.algorithms.join.rst (#155298) 2025-06-06 22:44:50 +00:00
debugging_environment_variables.md DOC: Convert to markdown: ddp_comm_hooks.rst, debugging_environment_variables.rst, deploy.rst, deterministic.rst, distributed.algorithms.join.rst (#155298) 2025-06-06 22:44:50 +00:00
deterministic.md DOC: Convert to markdown: ddp_comm_hooks.rst, debugging_environment_variables.rst, deploy.rst, deterministic.rst, distributed.algorithms.join.rst (#155298) 2025-06-06 22:44:50 +00:00
distributed._dist2.md Add a title to distributed._dist2.md (#159385) 2025-07-30 04:09:41 +00:00
distributed.algorithms.join.md DOC: Convert to markdown: ddp_comm_hooks.rst, debugging_environment_variables.rst, deploy.rst, deterministic.rst, distributed.algorithms.join.rst (#155298) 2025-06-06 22:44:50 +00:00
distributed.checkpoint.md [DCP][HuggingFace] Add Support for dequantization of SafeTensors checkpoints (#160682) 2025-09-04 01:09:53 +00:00
distributed.elastic.md NUMA binding integration with elastic agent and torchrun (#149334) 2025-07-25 21:19:49 +00:00
distributed.fsdp.fully_shard.md [FSDP2] explain user contract for fully_shard (#156070) 2025-06-17 10:03:19 +00:00
distributed.md [DCP][HuggingFace] Add Support for dequantization of SafeTensors checkpoints (#160682) 2025-09-04 01:09:53 +00:00
distributed.optim.md Fix #155018 (convert distributed rst to markdown) (#155528) 2025-06-16 20:46:09 +00:00
distributed.pipelining.md [PP] Add DualPipeV schedule (#159591) 2025-08-14 14:58:35 +00:00
distributed.tensor.md [DTensor] Make default RNG semantics match user-passed generator (#160482) 2025-08-25 04:21:19 +00:00
distributed.tensor.parallel.md Convert to markdown: distributed.tensor.parallel.rst, distributed.tensor.rst, distributions.rst, dlpack.rst (#155297) 2025-06-13 22:08:37 +00:00
distributions.md Convert to markdown: distributed.tensor.parallel.rst, distributed.tensor.rst, distributions.rst, dlpack.rst (#155297) 2025-06-13 22:08:37 +00:00
dlpack.md Convert to markdown: distributed.tensor.parallel.rst, distributed.tensor.rst, distributions.rst, dlpack.rst (#155297) 2025-06-13 22:08:37 +00:00
docutils.conf
export.md [export] Add runnable code to export docs (#158506) 2025-07-17 20:15:22 +00:00
fft.md Convert to .md: draft_export.rst, export.ir_spec.rst, fft.rst (#155567) 2025-06-13 05:19:43 +00:00
fsdp.md Convert rst files to md (#155369) 2025-06-11 23:00:52 +00:00
func.api.md Convert rst files to md (#155369) 2025-06-11 23:00:52 +00:00
func.batch_norm.md Convert rst files to md (#155369) 2025-06-11 23:00:52 +00:00
func.md Convert rst files to md (#155369) 2025-06-11 23:00:52 +00:00
func.migrating.md Convert rst files to md (#155369) 2025-06-11 23:00:52 +00:00
func.ux_limitations.md Fix #155022 rst to markdown conversion (#155540) 2025-06-12 00:21:22 +00:00
func.whirlwind_tour.md Fix #155022 rst to markdown conversion (#155540) 2025-06-12 00:21:22 +00:00
future_mod.md Fix #155022 rst to markdown conversion (#155540) 2025-06-12 00:21:22 +00:00
futures.md Fix #155022 rst to markdown conversion (#155540) 2025-06-12 00:21:22 +00:00
fx.experimental.md Fix #155022 rst to markdown conversion (#155540) 2025-06-12 00:21:22 +00:00
fx.md [3/n] Remove references to TorchScript in PyTorch docs (#158315) 2025-07-15 21:14:18 +00:00
hub.md Convert hub.rst to hub.md (#155483) 2025-06-13 04:39:55 +00:00
index.md Add placeholder for the User Guide (#159379) 2025-08-13 14:56:04 +00:00
jit_builtin_functions.rst [4/n] Remove references to TorchScript in PyTorch docs (#158317) 2025-07-16 20:01:34 +00:00
jit_language_reference_v2.md [1/n] Remove references to TorchScript in PyTorch docs (#158305) 2025-07-15 20:16:53 +00:00
jit_language_reference.md [2/n] Remove references to TorchScript in PyTorch docs (#158306) 2025-07-15 20:57:23 +00:00
jit_python_reference.md [3/n] Remove references to TorchScript in PyTorch docs (#158315) 2025-07-15 21:14:18 +00:00
jit_unsupported.md [4/n] Remove references to TorchScript in PyTorch docs (#158317) 2025-07-16 20:01:34 +00:00
jit_utils.md Convert to markdown: jit_python_reference.rst, jit_unsupported.rst, jit_utils.rst, library.rst (#155404) 2025-06-26 21:09:46 +00:00
jit.rst [4/n] Remove references to TorchScript in PyTorch docs (#158317) 2025-07-16 20:01:34 +00:00
library.md Add utility to get computed kernel in torch.library (#158393) 2025-08-13 21:00:59 +00:00
linalg.md [Docs] Convert to markdown to fix 155025 (#155789) 2025-06-17 15:08:14 +00:00
logging.md [Docs] Convert to markdown to fix 155025 (#155789) 2025-06-17 15:08:14 +00:00
masked.md [Docs] Convert to markdown to fix 155025 (#155789) 2025-06-17 15:08:14 +00:00
math-quantizer-equation.png
meta.md [Docs] Convert to markdown to fix 155025 (#155789) 2025-06-17 15:08:14 +00:00
miscellaneous_environment_variables.md [Docs] Convert to markdown to fix 155025 (#155789) 2025-06-17 15:08:14 +00:00
mobile_optimizer.md DOC: Convert to markdown: mobile_optimizer.rst, model_zoo.rst, module_tracker.rst, monitor.rst, mps_environment_variables.rst (#155702) 2025-06-11 22:16:04 +00:00
model_zoo.md DOC: Convert to markdown: mobile_optimizer.rst, model_zoo.rst, module_tracker.rst, monitor.rst, mps_environment_variables.rst (#155702) 2025-06-11 22:16:04 +00:00
module_tracker.md DOC: Convert to markdown: mobile_optimizer.rst, model_zoo.rst, module_tracker.rst, monitor.rst, mps_environment_variables.rst (#155702) 2025-06-11 22:16:04 +00:00
monitor.md DOC: Convert to markdown: mobile_optimizer.rst, model_zoo.rst, module_tracker.rst, monitor.rst, mps_environment_variables.rst (#155702) 2025-06-11 22:16:04 +00:00
mps_environment_variables.md DOC: Convert to markdown: mobile_optimizer.rst, model_zoo.rst, module_tracker.rst, monitor.rst, mps_environment_variables.rst (#155702) 2025-06-11 22:16:04 +00:00
mps.md Fix/issue #155027 (#155252) 2025-06-08 21:17:31 +00:00
mtia.md Fix/issue #155027 (#155252) 2025-06-08 21:17:31 +00:00
mtia.memory.md [Re-land][Inductor] Support native Inductor as backend for MTIA (#159211) 2025-07-29 17:03:24 +00:00
multiprocessing.md Fix/issue #155027 (#155252) 2025-06-08 21:17:31 +00:00
name_inference.md Fix/issue #155027 (#155252) 2025-06-08 21:17:31 +00:00
named_tensor.md Convert to markdown: named_tensor.rst, nested.rst, nn.attention.bias.rst, nn.attention.experimental.rst, nn.attention.flex_attention.rst #155028 (#155696) 2025-06-14 03:32:00 +00:00
nested.md Convert to markdown: named_tensor.rst, nested.rst, nn.attention.bias.rst, nn.attention.experimental.rst, nn.attention.flex_attention.rst #155028 (#155696) 2025-06-14 03:32:00 +00:00
nn.aliases.md [BE] Use .md instead of .rst for nn.aliases doc (#158666) 2025-07-25 22:03:55 +00:00
nn.attention.bias.md Convert to markdown: named_tensor.rst, nested.rst, nn.attention.bias.rst, nn.attention.experimental.rst, nn.attention.flex_attention.rst #155028 (#155696) 2025-06-14 03:32:00 +00:00
nn.attention.experimental.md Convert to markdown: named_tensor.rst, nested.rst, nn.attention.bias.rst, nn.attention.experimental.rst, nn.attention.flex_attention.rst #155028 (#155696) 2025-06-14 03:32:00 +00:00
nn.attention.flex_attention.md Add kernel options to flex docs (#158875) 2025-07-23 19:05:19 +00:00
nn.attention.rst
nn.functional.rst
nn.init.rst
nn.rst [BE] More torch.nn docs coverage test (except for torch.nn.parallel) (#158654) 2025-07-25 22:03:55 +00:00
notes.md Migrate to new theme (#149331) 2025-04-16 21:35:19 +00:00
onnx_export.md [ONNX] Remove enable_fake_mode and exporter_legacy (#161222) 2025-08-22 22:15:27 +00:00
onnx_ops.md [ONNX] Implement Attention-23 (#156431) 2025-06-20 23:54:57 +00:00
onnx_verification.md [ONNX] Refactor torchscript based exporter (#161323) 2025-09-02 16:10:30 +00:00
onnx.md [ONNX] Refactor torchscript based exporter (#161323) 2025-09-02 16:10:30 +00:00
optim.aliases.md Document the rest of the specific optimizer module APIs (#158669) 2025-07-19 07:27:15 +00:00
optim.md [muon] Introduce Muon optimizer to PyTorch (#160213) 2025-08-24 08:03:04 +00:00
package.md [3/n] Remove references to TorchScript in PyTorch docs (#158315) 2025-07-15 21:14:18 +00:00
profiler.md Convert rst to markdown - profiler.rst #155031 (#155559) 2025-06-13 05:02:54 +00:00
pytorch-api.md Add placeholder for the User Guide (#159379) 2025-08-13 14:56:04 +00:00
quantization-support.md Convert to markdown: quantization-accuracy-debugging.rst, quantization-backend-configuration.rst, quantization-support.rst, random.rst (#155520) 2025-06-18 18:46:04 +00:00
quantization.rst Remove the uncessary empty file (#160728) 2025-08-19 10:54:08 +00:00
random.md Convert to markdown: quantization-accuracy-debugging.rst, quantization-backend-configuration.rst, quantization-support.rst, random.rst (#155520) 2025-06-18 18:46:04 +00:00
rpc.md RPC tutorial audit (#157938) 2025-07-10 14:15:37 +00:00
signal.md Convert rst to md: rpc.rst, signal.rst, size.rst, special.rst (#155430) 2025-06-18 01:27:04 +00:00
size.md Convert rst to md: rpc.rst, signal.rst, size.rst, special.rst (#155430) 2025-06-18 01:27:04 +00:00
sparse.rst [BE] fix typos in docs/ (#156080) 2025-06-21 02:47:32 +00:00
special.md Convert rst to md: rpc.rst, signal.rst, size.rst, special.rst (#155430) 2025-06-18 01:27:04 +00:00
storage.rst Super tiny fix typo (#151212) 2025-04-14 16:47:40 +00:00
tensor_attributes.rst revamp dtype documentation for 2025 (#156087) 2025-06-27 13:10:23 +00:00
tensor_view.rst
tensorboard.rst
tensors.rst revamp dtype documentation for 2025 (#156087) 2025-06-27 13:10:23 +00:00
testing.md Convert to markdown: testing.rst, threading_environment_variables.rst, torch_cuda_memory.rst, torch_environment_variables.rst, torch_nccl_environment_variables.rst (#155523) 2025-06-10 20:38:36 +00:00
threading_environment_variables.md Convert to markdown: testing.rst, threading_environment_variables.rst, torch_cuda_memory.rst, torch_environment_variables.rst, torch_nccl_environment_variables.rst (#155523) 2025-06-10 20:38:36 +00:00
torch_cuda_memory.md Fixes broken memory_viz link in CUDA memory docs (#161426) 2025-09-02 02:06:54 +00:00
torch_environment_variables.md Convert to markdown: testing.rst, threading_environment_variables.rst, torch_cuda_memory.rst, torch_environment_variables.rst, torch_nccl_environment_variables.rst (#155523) 2025-06-10 20:38:36 +00:00
torch_nccl_environment_variables.md Convert to markdown: testing.rst, threading_environment_variables.rst, torch_cuda_memory.rst, torch_environment_variables.rst, torch_nccl_environment_variables.rst (#155523) 2025-06-10 20:38:36 +00:00
torch.aliases.md Remove torch.functional entries from the doc ignore list (#158581) 2025-07-25 17:19:01 +00:00
torch.compiler_aot_inductor_debugging_guide.md [doc] AOTI debugging guide (#160430) 2025-08-14 23:42:17 +00:00
torch.compiler_aot_inductor_minifier.md Converting .rst files to .md files (#155377) 2025-06-13 22:54:27 +00:00
torch.compiler_aot_inductor.md [doc] AOTI debugging guide (#160430) 2025-08-14 23:42:17 +00:00
torch.compiler_api.md Implement guard collectives (optimized version) (#156562) 2025-06-24 04:59:49 +00:00
torch.compiler_backward.md Add AOTDispatcher config to set backward autocast behavior (#156356) 2025-06-27 14:58:58 +00:00
torch.compiler_cudagraph_trees.md [Graph Partition] add graph partition doc (#159450) 2025-07-30 17:01:10 +00:00
torch.compiler_custom_backends.md DOC: Convert to markdown: torch.compiler_best_practices_for_backends.rst, torch.compiler_cudagraph_trees.rst, torch.compiler_custom_backends.rst, torch.compiler_dynamic_shapes.rst, torch.compiler_dynamo_deepdive.rst (#155137) 2025-06-10 20:51:05 +00:00
torch.compiler_dynamic_shapes.md DOC: Convert to markdown: torch.compiler_best_practices_for_backends.rst, torch.compiler_cudagraph_trees.rst, torch.compiler_custom_backends.rst, torch.compiler_dynamic_shapes.rst, torch.compiler_dynamo_deepdive.rst (#155137) 2025-06-10 20:51:05 +00:00
torch.compiler_dynamo_deepdive.md Dynamo Deep Dive Documentation Fix (#158860) 2025-08-12 08:53:33 +00:00
torch.compiler_dynamo_overview.md convert: rst to myst pr 1/2 (#155840) 2025-06-13 18:02:28 +00:00
torch.compiler_fake_tensor.md convert: rst to myst pr 1/2 (#155840) 2025-06-13 18:02:28 +00:00
torch.compiler_faq.md convert: rst to myst pr2/2 (#155911) 2025-06-16 00:44:44 +00:00
torch.compiler_fine_grain_apis.md convert: rst to myst pr2/2 (#155911) 2025-06-16 00:44:44 +00:00
torch.compiler_get_started.md convert: rst to myst pr2/2 (#155911) 2025-06-16 00:44:44 +00:00
torch.compiler_inductor_profiling.md Convert compiler rst files to markdown (#155335) 2025-06-10 01:12:11 +00:00
torch.compiler_inductor_provenance.rst Update provenance tracking doc (#154062) 2025-05-23 17:09:52 +00:00
torch.compiler_ir.md [export] Update docs (#157750) 2025-07-16 19:53:12 +00:00
torch.compiler_nn_module.md Convert compiler rst files to markdown (#155335) 2025-06-10 01:12:11 +00:00
torch.compiler_performance_dashboard.md Convert compiler rst files to markdown (#155335) 2025-06-10 01:12:11 +00:00
torch.compiler_profiling_torch_compile.md [Docs] Update PT2 Profiler Torch-Compiled Region Image (#158066) 2025-07-11 07:56:45 +00:00
torch.compiler_transformations.md [Docs] Convert to markdown: torch.compiler_transformations.rst, torch.compiler.config.rst (#155347) 2025-06-11 18:55:30 +00:00
torch.compiler_troubleshooting_old.md Add torch compile force disable caches alias (#158072) 2025-08-02 23:23:17 +00:00
torch.compiler_troubleshooting.md [doc] AOTI debugging guide (#160430) 2025-08-14 23:42:17 +00:00
torch.compiler.config.md [Docs] Convert to markdown: torch.compiler_transformations.rst, torch.compiler.config.rst (#155347) 2025-06-11 18:55:30 +00:00
torch.compiler.md [dynamo, docs] programming model dynamo core concepts (#157985) 2025-07-29 01:53:34 +00:00
torch.overrides.md DOC: Convert to markdown: torch.overrides.rst, type_info.rst, utils.rst, xpu.rst (#155088) 2025-06-06 20:16:13 +00:00
torch.rst Remove torch.functional entries from the doc ignore list (#158581) 2025-07-25 17:19:01 +00:00
type_info.md finfo eps doc fix (#160502) 2025-08-14 01:49:35 +00:00
utils.md DOC: Convert to markdown: torch.overrides.rst, type_info.rst, utils.rst, xpu.rst (#155088) 2025-06-06 20:16:13 +00:00
xpu.md DOC: Convert to markdown: torch.overrides.rst, type_info.rst, utils.rst, xpu.rst (#155088) 2025-06-06 20:16:13 +00:00