pytorch/torch
Simon Fan a80eb84a5f [ca] support higher order gradients (create_graph=True) (#153222)
Adds create_graph support if you don't compile or compile only with torch.compile(backend="eager").

A backend that uses AOTDispatch produces a post-dispatch AOT backward, whose double backward will be silently incorrect if the forward trace involved any ops that are not composite implicit.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153222
Approved by: https://github.com/jansel
ghstack dependencies: #153193
2025-05-13 16:42:09 +00:00
_awaits
_C [Memento] Add PT2 to Memory Snapshot (#152707) 2025-05-12 21:12:51 +00:00
_C_flatbuffer
_custom_op
_decomp Fix torch.isin decomposition for scalar inputs (#153216) 2025-05-09 20:26:25 +00:00
_dispatch
_dynamo [ca] support higher order gradients (create_graph=True) (#153222) 2025-05-13 16:42:09 +00:00
_export Revert "[export][cond] support merging constant ints as unbacked symint (#152742)" 2025-05-12 23:06:33 +00:00
_functorch [compile-time traces] Profile large missing gaps in compile time (#151256) 2025-05-13 14:44:51 +00:00
_higher_order_ops Revert "[export][cond] support merging constant ints as unbacked symint (#152742)" 2025-05-12 23:06:33 +00:00
_inductor [compile-time traces] Profile large missing gaps in compile time (#151256) 2025-05-13 14:44:51 +00:00
_lazy
_library Add torch._C.Tag.needs_contiguous_strides (#152859) 2025-05-08 04:49:59 +00:00
_logging [export] Beef up guard_added logs (#149465) 2025-03-20 23:02:07 +00:00
_numpy Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
_prims
_prims_common consolidate guard_or_x and definitely_x (#152463) 2025-05-02 18:08:11 +00:00
_refs [dynamic shapes] use try-catch instead of guard_or_true for reshape_view_helper (#152638) 2025-05-06 00:54:24 +00:00
_strobelight
_subclasses [BE]: Update ruff to 0.11.8 (#153249) 2025-05-12 18:30:52 +00:00
_vendor
accelerator Add torch.accelerator.device_index as accelerator's device switch context (#148864) 2025-04-25 09:45:25 +00:00
amp [Intel GPU] skip a cuda api call in amp to save some host overhead on xpu (#151111) 2025-04-13 06:37:07 +00:00
ao [BE]: Update ruff to 0.11.8 (#153249) 2025-05-12 18:30:52 +00:00
autograd [autograd][docs] Add more details on why save_for_backward is important in extending autograd note (#153005) 2025-05-09 16:36:57 +00:00
backends Revert "refine fp32 precision api (#125888)" 2025-05-11 00:35:46 +00:00
compiler [MegaCache] Return None on no compilation (#151921) 2025-04-23 04:32:06 +00:00
contrib
cpu
csrc [ca] support higher order gradients (create_graph=True) (#153222) 2025-05-13 16:42:09 +00:00
cuda make use_mem_pool threadlocal (#153356) 2025-05-13 00:16:07 +00:00
distributed [PP] Optimize memory usage by releasing output memory earlier (#153383) 2025-05-13 14:42:38 +00:00
distributions [Docs] Add Description of validate_args for torch.distributions (#152173) 2025-04-30 18:01:20 +00:00
export [export] Exporter API prototype. (#153205) 2025-05-11 14:20:09 +00:00
fft
func
futures
fx include user stacks with constraint violation error message (#152924) 2025-05-10 13:36:47 +00:00
jit Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
legacy
lib [1/N] Use internal linkage in torch/csrc C++ files. (#150930) 2025-04-11 02:19:31 +00:00
linalg Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
masked [BE]: Update ruff to 0.11.8 (#153249) 2025-05-12 18:30:52 +00:00
monitor
mps [MPS] Make torch.mps.compile_shader public (#148972) 2025-03-11 20:20:58 +00:00
mtia [Kineto] Enable OOM observer (#152160) 2025-04-27 15:56:44 +00:00
multiprocessing
nativert [nativert] Address tooling setup for torch/nativert/ (#153164) 2025-05-08 21:11:33 +00:00
nested [aotd] Guess tangents stride as output strides (#144579) 2025-03-20 15:41:36 +00:00
nn [BE]: Update ruff to 0.11.8 (#153249) 2025-05-12 18:30:52 +00:00
onnx [BE]: Update ruff to 0.11.8 (#153249) 2025-05-12 18:30:52 +00:00
optim [BE]: Improve decorator typing for Optimizer subclasses (#153374) 2025-05-12 22:55:25 +00:00
package
profiler [profiler][retry] don't disable CUPTI_LAZY_REINIT for cuda >= 12.6 (#151124) 2025-04-15 16:11:49 +00:00
quantization
signal
sparse Revert "has_triton: Use the device interface for detecting Triton availability (#139171)" 2025-05-10 14:46:23 +00:00
special
testing torch.tensordot: performance improvements when contracting to a scalar. (#145936) 2025-05-13 10:57:30 +00:00
utils [MemoryZ] Sync changes to internal page (#153166) 2025-05-13 15:35:10 +00:00
xpu Correct torch.xpu.is_bf16_supported return False if no XPU detected (#152317) 2025-05-06 10:03:17 +00:00
__config__.py
__future__.py
__init__.py Detect NVSHMEM location (#153010) 2025-05-07 23:35:04 +00:00
_appdirs.py Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
_classes.py
_compile.py
_custom_ops.py
_deploy.py
_environment.py
_guards.py [invoke_subgraph] Cache on tangent metadata and retrace if needed (#152357) 2025-04-30 23:49:17 +00:00
_jit_internal.py
_linalg_utils.py
_lobpcg.py Fixed rerr computation in lobpcg (#152789) 2025-05-08 12:22:31 +00:00
_lowrank.py
_meta_registrations.py error out on negative offs or on K=0 in group gemm (#153226) 2025-05-10 01:13:18 +00:00
_namedtensor_internals.py
_ops.py Introduce unsafe way to mark functions as cacheable (#151603) 2025-04-21 17:37:38 +00:00
_python_dispatcher.py
_size_docs.py
_sources.py
_storage_docs.py
_streambase.py
_tensor_docs.py Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
_tensor_str.py add torch.float4_e2m1fn_x2 to PyTorch (#148791) 2025-03-27 17:32:20 +00:00
_tensor.py Avoid triggering ignored requires_grad warning in our code (#152686) 2025-05-05 23:56:40 +00:00
_thread_safe_fork.py
_torch_docs.py Fix the basic description of torch.min(), torch.max(), torch.all(), torch.any() (#152658) 2025-05-08 22:59:14 +00:00
_utils_internal.py [reland] Add graph module runtime asserts to AOTI (#153182) 2025-05-09 22:56:19 +00:00
_utils.py
_VF.py
_vmap_internals.py Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
_weights_only_unpickler.py
CMakeLists.txt Add new dependences for gen_pyi.py (#150391) 2025-04-03 14:18:18 +00:00
custom_class_detail.h
custom_class.h
extension.h
functional.py Optimize cdist param description (#151178) 2025-04-14 13:53:10 +00:00
hub.py
library.h Overload Library::def rather than templating it (#151626) 2025-04-18 22:51:16 +00:00
library.py fix spammy library deinit errors when user passes an invalid TORCH_LOGS argument (#151678) 2025-04-22 20:13:52 +00:00
overrides.py [CUDA][cuBLAS] Aten GEMM overload for FP32 output from FP16/BF16 inputs (#150812) 2025-04-18 01:53:26 +00:00
py.typed
quasirandom.py
random.py Update description for torch.random.fork_rng (#151881) 2025-04-23 16:59:29 +00:00
README.md Rename README.txt to README.md (#149811) 2025-03-24 22:33:33 +00:00
return_types.py
script.h
serialization.py Move get accelerator to use build time flags when possible (#146098) 2025-03-10 13:17:58 +00:00
storage.py add torch.float4_e2m1fn_x2 to PyTorch (#148791) 2025-03-27 17:32:20 +00:00
torch_version.py
types.py
version.py.tpl

Note [TH abstraction violation]


TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  Although they are installed, these headers are also *internal
implementation detail* headers, and external clients should largely not
use their contents.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate the structs they define.  However, there are a few places
in torch/csrc where we violate this abstraction.  Each is marked with
a pointer to this note, and each will have to be updated when we refactor
the guts of THTensor and related structures.