pytorch/torch
Latest commit fb6ac2f161 (Meet Vadakkanchery): [DCP] Add logging for _stateful_to_state_dict(), stage_state_dict(), and synchronize_staging() (#151320)
Summary: As titled.

Test Plan: CI

Differential Revision: D73040700

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151320
Approved by: https://github.com/saumishr
2025-04-17 01:08:32 +00:00
Name | Last commit message | Last commit date
_awaits
_C Reapply "ProcessGroupGloo: support lazy_init (#150801)" (#151031) 2025-04-11 01:58:35 +00:00
_C_flatbuffer
_custom_op
_decomp Make torch._chunk_cat support non-contiguous inputs (#151263) 2025-04-16 04:18:46 +00:00
_dispatch [BE][PYFMT] migrate PYFMT for torch._dynamo to ruff format (#144549) 2025-02-28 03:03:53 +00:00
_dynamo [compile][compile time traces] Add more dynamo traces (#151357) 2025-04-16 20:37:08 +00:00
_export [dynamic shapes] add sym_and, sym_or (#150456) 2025-04-14 18:18:06 +00:00
_functorch [aot autograd][logging] Profile large missing gaps in compile time tracing (#151256) 2025-04-16 20:37:08 +00:00
_higher_order_ops [HOP] Reworked DispatchKey.Autograd (#151107) 2025-04-15 19:55:46 +00:00
_inductor improve noop elimination for view (#151095) 2025-04-16 23:55:32 +00:00
_lazy
_library Generate meta kernel with operator profiles (#150807) 2025-04-14 19:28:54 +00:00
_logging [export] Beef up guard_added logs (#149465) 2025-03-20 23:02:07 +00:00
_numpy
_prims Support torch.compile rng selective activation checkpointing with cudagraph (#146878) 2025-02-28 00:47:03 +00:00
_prims_common Propagate callable parameter types using ParamSpec (#142306) (#151014) 2025-04-13 20:38:11 +00:00
_refs Remove guard_size_oblivious from vector_norm decomposition. (#148809) 2025-04-10 16:19:00 +00:00
_strobelight Enable strobelight profiling specific compile frame ids using COMPILE_STROBELIGHT_FRAME_FILTER (#147549) 2025-02-22 03:44:53 +00:00
_subclasses Fix assert_tensor_meta (#150808) 2025-04-14 19:28:54 +00:00
_vendor
accelerator Delegate torch.accelerator.device_count to torch.xxx.device_count for multi-process usage (#149924) 2025-04-10 02:37:37 +00:00
amp [Intel GPU] skip a cuda api call in amp to save some host overhead on xpu (#151111) 2025-04-13 06:37:07 +00:00
ao [BE][1/2] Move original_weights_lookup attribute to constant (#151241) 2025-04-16 00:41:25 +00:00
autograd [Profiler][HPU] Enable profiler.key_averages().table() for HPU devices (#150770) 2025-04-11 05:17:12 +00:00
backends Expose is_available API for torch.backends.mkldnn (#147432) 2025-04-10 05:05:37 +00:00
compiler Add inductor standalone_compile API (#150670) 2025-04-15 23:38:15 +00:00
contrib
cpu [CPU Stream] Add noop for CPU stream record_event() and wait_event() (#145935) 2025-02-20 18:50:55 +00:00
csrc [AOTInductor] Add interface for user managed buffer in package api. (#151325) 2025-04-16 04:25:40 +00:00
cuda Delegate torch.accelerator.device_count to torch.xxx.device_count for multi-process usage (#149924) 2025-04-10 02:37:37 +00:00
distributed [DCP] Add logging for _stateful_to_state_dict(), stage_state_dict(), and synchronize_staging() (#151320) 2025-04-17 01:08:32 +00:00
distributions [typing] Add type hints to __init__ methods in torch.distributions. (#144197) 2025-04-06 17:50:35 +00:00
export [export] Add draft-export to error msg (#151065) 2025-04-16 08:56:02 +00:00
fft
func
futures PEP585: More UP006 fixes (#146392) 2025-02-20 06:18:13 +00:00
fx Revert "[ez] Make relaxed constraint error message more user friendly (#151407)" 2025-04-16 20:40:22 +00:00
jit Fix torchscript issues with reference quantized modules (#150870) 2025-04-10 20:14:45 +00:00
legacy
lib [1/N] Use internal linkage in torch/csrc C++ files. (#150930) 2025-04-11 02:19:31 +00:00
linalg Implement gradient for the residuals of torch.linalg.lstsq (#148526) 2025-03-10 12:35:09 +00:00
masked Use variadic length tuple for torch.masked.DimOrDims (#149870) 2025-03-31 07:06:58 +00:00
monitor
mps [MPS] Make torch.mps.compile_shader public (#148972) 2025-03-11 20:20:58 +00:00
mtia [MTIA] Add _mtia_maybeExchangeDevice to MTIA module (#149340) 2025-03-18 15:15:12 +00:00
multiprocessing
nested [aotd] Guess tangents stride as output strides (#144579) 2025-03-20 15:41:36 +00:00
nn Allow to run flex_attention on HPU (#148656) 2025-04-16 19:49:15 +00:00
onnx [ONNX] Add a comment for handling bf16/fp8 tensor to numpy conversion (#151371) 2025-04-16 00:49:38 +00:00
optim Optimize typing in lr_scheduler.py (#151219) 2025-04-15 01:00:13 +00:00
package Remove code for Python < 3.9 (#147097) 2025-02-14 03:22:49 +00:00
profiler [profiler][retry] don't disable CUPTI_LAZY_REINIT for cuda >= 12.6 (#151124) 2025-04-15 16:11:49 +00:00
quantization
signal
sparse Fix spelling (#149277) 2025-03-20 01:02:32 +00:00
special
testing Make torch._chunk_cat support non-contiguous inputs (#151263) 2025-04-16 04:18:46 +00:00
utils Add ccode for CeilToInt and IntTrueDiv (#151375) 2025-04-16 16:47:55 +00:00
xpu xpu: torch.xpu.get_arch_list() to return [] if xpu not compiled (#147431) 2025-02-24 01:35:54 +00:00
__config__.py
__future__.py
__init__.py [profiler][retry] don't disable CUPTI_LAZY_REINIT for cuda >= 12.6 (#151124) 2025-04-15 16:11:49 +00:00
_appdirs.py
_classes.py
_compile.py
_custom_ops.py
_deploy.py
_environment.py
_guards.py [dynamo][invoke_subgraph] Use FxGraphModule comparison instead of hashing (#150911) 2025-04-14 23:34:26 +00:00
_jit_internal.py [BE][CI] bump ruff to 0.9.2: multiline assert statements (#144546) 2025-02-27 20:46:16 +00:00
_linalg_utils.py
_lobpcg.py [BE][CI] bump ruff to 0.9.2: multiline assert statements (#144546) 2025-02-27 20:46:16 +00:00
_lowrank.py
_meta_registrations.py [dtensor] add op support for torch._grouped_mm (#151072) 2025-04-12 07:07:44 +00:00
_namedtensor_internals.py
_ops.py [HOP] Reworked DispatchKey.Autograd (#151107) 2025-04-15 19:55:46 +00:00
_python_dispatcher.py
_size_docs.py
_sources.py
_storage_docs.py
_streambase.py
_tensor_docs.py [Docs] Clarify behavior when integer dtype is used with requires_grad=True in tensor.to() (#150913) 2025-04-10 02:52:58 +00:00
_tensor_str.py add torch.float4_e2m1fn_x2 to PyTorch (#148791) 2025-03-27 17:32:20 +00:00
_tensor.py Revert "Fix non-bitwise type annotations for Tensor operators (see #145838) (#146845)" 2025-02-18 19:01:27 +00:00
_thread_safe_fork.py
_torch_docs.py Fix keepdim param optional description (#151197) 2025-04-16 23:15:30 +00:00
_utils_internal.py [profiler][retry] don't disable CUPTI_LAZY_REINIT for cuda >= 12.6 (#151124) 2025-04-15 16:11:49 +00:00
_utils.py Allow torch.load under FakeTensorMode to load FakeTensors with correct devices (for plain Tensors) (#147786) 2025-03-06 12:04:32 +00:00
_VF.py
_vmap_internals.py
_weights_only_unpickler.py Add sparse tensors constructed via legacy constructor to _sparse_tensors_to_validate (#147759) 2025-02-25 23:51:12 +00:00
CMakeLists.txt Add new dependences for gen_pyi.py (#150391) 2025-04-03 14:18:18 +00:00
custom_class_detail.h
custom_class.h Remove unneeded Clang-tidy suppression (#148246) 2025-03-01 16:51:54 +00:00
extension.h
functional.py Optimize cdist param description (#151178) 2025-04-14 13:53:10 +00:00
hub.py [BE][CI][Easy] bump ruff to 0.9.0: long statements in docstrings (#146509) 2025-02-24 19:56:08 +00:00
library.h [pytorch] add header docs for TORCH_LIBRARY_THREAD_UNSAFE_LAZY_INIT (#150854) 2025-04-09 12:59:24 +00:00
library.py [custom ops] Fix destroy function (#151299) 2025-04-16 06:18:09 +00:00
overrides.py Propagate callable parameter types using ParamSpec (#142306) (#151014) 2025-04-13 20:38:11 +00:00
py.typed
quasirandom.py
random.py
README.md Rename README.txt to README.md (#149811) 2025-03-24 22:33:33 +00:00
return_types.py
script.h
serialization.py Move get accelerator to use build time flags when possible (#146098) 2025-03-10 13:17:58 +00:00
storage.py add torch.float4_e2m1fn_x2 to PyTorch (#148791) 2025-03-27 17:32:20 +00:00
torch_version.py
types.py
version.py.tpl

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TH/THC provide some .hpp headers, which are proper C++ headers rather than
C headers. These headers do double duty: they are installed alongside the
public C headers, but they are *internal implementation detail* headers
whose contents should largely not be used by external clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.
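
For concreteness, a minimal sketch of the distinction, assuming the legacy
TH C API (`THFloatTensor_*`, since replaced by ATen in modern PyTorch):
external clients should query a tensor through the public accessors declared
in `THTensor.h`, never through the struct fields that only `THTensor.hpp`
exposes.

```c
// Sketch only: assumes the legacy TH C API, which no longer ships with
// modern PyTorch.
#include <stdio.h>
#include <TH/TH.h>  // public C headers (THTensor.h et al.), not THTensor.hpp

int main(void) {
  THFloatTensor *t = THFloatTensor_newWithSize2d(3, 4);

  // Correct: query geometry through the public accessors from THTensor.h.
  for (int d = 0; d < THFloatTensor_nDimension(t); d++) {
    printf("dim %d has size %lld\n", d, (long long)THFloatTensor_size(t, d));
  }

  // Wrong: reading the THTensor struct's fields directly requires the
  // internal THTensor.hpp; the few torch/csrc sites that do this are the
  // abstraction violations this note marks.

  THFloatTensor_free(t);
  return 0;
}
```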