pytorch/torch
Latest commit fb6ac2f161 (Meet Vadakkanchery): [DCP] Add logging for _stateful_to_state_dict(), stage_state_dict(), and synchronize_staging() (#151320)
Summary: As titled.

Test Plan: CI

Differential Revision: D73040700

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151320
Approved by: https://github.com/saumishr
2025-04-17 01:08:32 +00:00
Name | Last commit message | Last commit date
_awaits
_C Reapply "ProcessGroupGloo: support lazy_init (#150801)" (#151031) 2025-04-11 01:58:35 +00:00
_C_flatbuffer
_custom_op
_decomp Make torch._chunk_cat support non-contiguous inputs (#151263) 2025-04-16 04:18:46 +00:00
_dispatch [BE][PYFMT] migrate PYFMT for torch._dynamo to ruff format (#144549) 2025-02-28 03:03:53 +00:00
_dynamo [compile][compile time traces] Add more dynamo traces (#151357) 2025-04-16 20:37:08 +00:00
_export [dynamic shapes] add sym_and, sym_or (#150456) 2025-04-14 18:18:06 +00:00
_functorch [aot autograd][logging] Profile large missing gaps in compile time tracing (#151256) 2025-04-16 20:37:08 +00:00
_higher_order_ops [HOP] Reworked DispatchKey.Autograd (#151107) 2025-04-15 19:55:46 +00:00
_inductor improve noop elimination for view (#151095) 2025-04-16 23:55:32 +00:00
_lazy
_library Generate meta kernel with operator profiles (#150807) 2025-04-14 19:28:54 +00:00
_logging [export] Beef up guard_added logs (#149465) 2025-03-20 23:02:07 +00:00
_numpy
_prims Support torch.compile rng selective activation checkpointing with cudagraph (#146878) 2025-02-28 00:47:03 +00:00
_prims_common Propagate callable parameter types using ParamSpec (#142306) (#151014) 2025-04-13 20:38:11 +00:00
_refs Remove guard_size_oblivious from vector_norm decomposition. (#148809) 2025-04-10 16:19:00 +00:00
_strobelight Enable strobelight profiling specific compile frame ids using COMPILE_STROBELIGHT_FRAME_FILTER (#147549) 2025-02-22 03:44:53 +00:00
_subclasses Fix assert_tensor_meta (#150808) 2025-04-14 19:28:54 +00:00
_vendor
accelerator Delegate torch.accelerator.device_count to torch.xxx.device_count for multi-process usage (#149924) 2025-04-10 02:37:37 +00:00
amp [Intel GPU] skip a cuda api call in amp to save some host overhead on xpu (#151111) 2025-04-13 06:37:07 +00:00
ao [BE][1/2] Move original_weights_lookup attribute to constant (#151241) 2025-04-16 00:41:25 +00:00
autograd [Profiler][HPU] Enable profiler.key_averages().table() for HPU devices (#150770) 2025-04-11 05:17:12 +00:00
backends Expose is_available API for torch.backends.mkldnn (#147432) 2025-04-10 05:05:37 +00:00
compiler Add inductor standalone_compile API (#150670) 2025-04-15 23:38:15 +00:00
contrib
cpu [CPU Stream] Add noop for CPU stream record_event() and wait_event() (#145935) 2025-02-20 18:50:55 +00:00
csrc [AOTInductor] Add interface for user managed buffer in package api. (#151325) 2025-04-16 04:25:40 +00:00
cuda Delegate torch.accelerator.device_count to torch.xxx.device_count for multi-process usage (#149924) 2025-04-10 02:37:37 +00:00
distributed [DCP] Add logging for _stateful_to_state_dict(), stage_state_dict(), and synchronize_staging() (#151320) 2025-04-17 01:08:32 +00:00
distributions [typing] Add type hints to __init__ methods in torch.distributions. (#144197) 2025-04-06 17:50:35 +00:00
export [export] Add draft-export to error msg (#151065) 2025-04-16 08:56:02 +00:00
fft
func
futures PEP585: More UP006 fixes (#146392) 2025-02-20 06:18:13 +00:00
fx Revert "[ez] Make relaxed constraint error message more user friendly (#151407)" 2025-04-16 20:40:22 +00:00
jit Fix torchscript issues with reference quantized modules (#150870) 2025-04-10 20:14:45 +00:00
legacy
lib [1/N] Use internal linkage in torch/csrc C++ files. (#150930) 2025-04-11 02:19:31 +00:00
linalg Implement gradient for the residuals of torch.linalg.lstsq (#148526) 2025-03-10 12:35:09 +00:00
masked Use variadic length tuple for torch.masked.DimOrDims (#149870) 2025-03-31 07:06:58 +00:00
monitor
mps [MPS] Make torch.mps.compile_shader public (#148972) 2025-03-11 20:20:58 +00:00
mtia [MTIA] Add _mtia_maybeExchangeDevice to MTIA module (#149340) 2025-03-18 15:15:12 +00:00
multiprocessing
nested [aotd] Guess tangents stride as output strides (#144579) 2025-03-20 15:41:36 +00:00
nn Allow to run flex_attention on HPU (#148656) 2025-04-16 19:49:15 +00:00
onnx [ONNX] Add a comment for handling bf16/fp8 tensor to numpy conversion (#151371) 2025-04-16 00:49:38 +00:00
optim Optimize typing in lr_scheduler.py (#151219) 2025-04-15 01:00:13 +00:00
package Remove code for Python < 3.9 (#147097) 2025-02-14 03:22:49 +00:00
profiler [profiler][retry] don't disable CUPTI_LAZY_REINIT for cuda >= 12.6 (#151124) 2025-04-15 16:11:49 +00:00
quantization
signal
sparse Fix spelling (#149277) 2025-03-20 01:02:32 +00:00
special
testing Make torch._chunk_cat support non-contiguous inputs (#151263) 2025-04-16 04:18:46 +00:00
utils Add ccode for CeilToInt and IntTrueDiv (#151375) 2025-04-16 16:47:55 +00:00
xpu xpu: torch.xpu.get_arch_list() to return [] if xpu not compiled (#147431) 2025-02-24 01:35:54 +00:00
__config__.py
__future__.py
__init__.py [profiler][retry] don't disable CUPTI_LAZY_REINIT for cuda >= 12.6 (#151124) 2025-04-15 16:11:49 +00:00
_appdirs.py
_classes.py
_compile.py
_custom_ops.py
_deploy.py
_environment.py
_guards.py [dynamo][invoke_subgraph] Use FxGraphModule comparison instead of hashing (#150911) 2025-04-14 23:34:26 +00:00
_jit_internal.py [BE][CI] bump ruff to 0.9.2: multiline assert statements (#144546) 2025-02-27 20:46:16 +00:00
_linalg_utils.py
_lobpcg.py [BE][CI] bump ruff to 0.9.2: multiline assert statements (#144546) 2025-02-27 20:46:16 +00:00
_lowrank.py
_meta_registrations.py [dtensor] add op support for torch._grouped_mm (#151072) 2025-04-12 07:07:44 +00:00
_namedtensor_internals.py
_ops.py [HOP] Reworked DispatchKey.Autograd (#151107) 2025-04-15 19:55:46 +00:00
_python_dispatcher.py
_size_docs.py
_sources.py
_storage_docs.py
_streambase.py
_tensor_docs.py [Docs] Clarify behavior when integer dtype is used with requires_grad=True in tensor.to() (#150913) 2025-04-10 02:52:58 +00:00
_tensor_str.py add torch.float4_e2m1fn_x2 to PyTorch (#148791) 2025-03-27 17:32:20 +00:00
_tensor.py Revert "Fix non-bitwise type annotations for Tensor operators (see #145838) (#146845)" 2025-02-18 19:01:27 +00:00
_thread_safe_fork.py
_torch_docs.py Fix keepdim param optional description (#151197) 2025-04-16 23:15:30 +00:00
_utils_internal.py [profiler][retry] don't disable CUPTI_LAZY_REINIT for cuda >= 12.6 (#151124) 2025-04-15 16:11:49 +00:00
_utils.py Allow torch.load under FakeTensorMode to load FakeTensors with correct devices (for plain Tensors) (#147786) 2025-03-06 12:04:32 +00:00
_VF.py
_vmap_internals.py
_weights_only_unpickler.py Add sparse tensors constructed via legacy constructor to _sparse_tensors_to_validate (#147759) 2025-02-25 23:51:12 +00:00
CMakeLists.txt Add new dependences for gen_pyi.py (#150391) 2025-04-03 14:18:18 +00:00
custom_class_detail.h
custom_class.h Remove unneeded Clang-tidy suppression (#148246) 2025-03-01 16:51:54 +00:00
extension.h
functional.py Optimize cdist param description (#151178) 2025-04-14 13:53:10 +00:00
hub.py [BE][CI][Easy] bump ruff to 0.9.0: long statements in docstrings (#146509) 2025-02-24 19:56:08 +00:00
library.h [pytorch] add header docs for TORCH_LIBRARY_THREAD_UNSAFE_LAZY_INIT (#150854) 2025-04-09 12:59:24 +00:00
library.py [custom ops] Fix destroy function (#151299) 2025-04-16 06:18:09 +00:00
overrides.py Propagate callable parameter types using ParamSpec (#142306) (#151014) 2025-04-13 20:38:11 +00:00
py.typed
quasirandom.py
random.py
README.md Rename README.txt to README.md (#149811) 2025-03-24 22:33:33 +00:00
return_types.py
script.h
serialization.py Move get accelerator to use build time flags when possible (#146098) 2025-03-10 13:17:58 +00:00
storage.py add torch.float4_e2m1fn_x2 to PyTorch (#148791) 2025-03-27 17:32:20 +00:00
torch_version.py
types.py
version.py.tpl

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TH/THC provide some .hpp headers, which are proper C++ headers rather than
C headers. These headers do double duty: they are installed alongside the
public C headers, but they are *internal implementation detail* headers
whose contents should largely not be used by external clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.
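
For concreteness, a minimal sketch of the distinction, assuming the legacy
TH C API (`THFloatTensor_*`, since replaced by ATen in modern PyTorch):
external clients should query a tensor through the public accessors declared
in `THTensor.h`, never through the struct fields that only `THTensor.hpp`
exposes.

```c
// Sketch only: assumes the legacy TH C API, which no longer ships with
// modern PyTorch.
#include <stdio.h>
#include <TH/TH.h>  // public C headers (THTensor.h et al.), not THTensor.hpp

int main(void) {
  THFloatTensor *t = THFloatTensor_newWithSize2d(3, 4);

  // Correct: query geometry through the public accessors from THTensor.h.
  for (int d = 0; d < THFloatTensor_nDimension(t); d++) {
    printf("dim %d has size %lld\n", d, (long long)THFloatTensor_size(t, d));
  }

  // Wrong: reading the THTensor struct's fields directly requires the
  // internal THTensor.hpp; the few torch/csrc sites that do this are the
  // abstraction violations this note marks.

  THFloatTensor_free(t);
  return 0;
}
```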