pytorch/torch
Laith Sakka cbf8e0fb1a use statically known true instead of guard size oblivious in bmm and mm inductor decompositions (#148893)
This was discussed with @eellison, who recommended using `statically_known_true` here. The intuition: we already have 0/1 specializations in place, so if we reach these checks with dynamic shapes that are not already specialized, we do not want to specialize them; "a recompilation here is not justified."
These are all non-semantic-changing optimizations.
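The distinction the commit relies on can be sketched with a toy model. These are hypothetical stand-ins, not the real `torch.fx.experimental.symbolic_shapes` implementation: `statically_known_true` answers "does this hold for every value the symbol could take?" and never installs a guard, while a guard-based check evaluates the runtime hint and records a guard that can force a recompilation later.

```python
class Size:
    """Toy shape dimension: either a compile-time-known int or a dynamic symbol."""

    def __init__(self, hint, static=True):
        self.hint = hint        # runtime value the tracer has observed
        self.static = static    # True if the value is pinned at compile time
        self.guards = []        # recompile conditions accumulated so far


def statically_known_true(size, pred):
    # True only if pred provably holds for every value `size` could take.
    # Never adds a guard, so taking the False branch costs nothing:
    # the optimization is simply skipped and no recompilation can result.
    return size.static and pred(size.hint)


def guard_true(size, pred):
    # Evaluates pred on the runtime hint and, for a dynamic size, records
    # a guard: if a later call violates it, the graph is recompiled.
    if not size.static:
        size.guards.append(pred)
    return pred(size.hint)
```

With a dynamic dim whose observed hint is 1, `guard_true(s, lambda v: v == 1)` returns True but specializes the graph to `v == 1`; `statically_known_true` returns False with no guard, which is the behavior the decompositions want for an optimization that is not worth a recompile.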

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148893
Approved by: https://github.com/eellison
2025-04-28 16:44:25 +00:00
_awaits
_C [Kineto] Enable OOM observer (#152160) 2025-04-27 15:56:44 +00:00
_C_flatbuffer
_custom_op
_decomp Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
_dispatch [BE][PYFMT] migrate PYFMT for torch._dynamo to ruff format (#144549) 2025-02-28 03:03:53 +00:00
_dynamo [BE] Migrate dtype_abbrs into one location (#152229) 2025-04-28 03:52:47 +00:00
_export [export] improve error message for deserializing custom triton op (#152029) 2025-04-24 20:22:05 +00:00
_functorch Correctly handle duplicated arguments when merging input views. (#146275) 2025-04-26 14:50:16 +00:00
_higher_order_ops [Typing] Enable torch.types.IntLikeType / FloatLikeType / BoolLikeType (#152157) 2025-04-25 19:00:10 +00:00
_inductor use statically known true instead of guard size oblivious in bmm and mm inductor decompositions (#148893) 2025-04-28 16:44:25 +00:00
_lazy
_library [torchbind] fix error message when attr is a real tensor. (#151944) 2025-04-23 17:32:11 +00:00
_logging [export] Beef up guard_added logs (#149465) 2025-03-20 23:02:07 +00:00
_numpy Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
_prims Support torch.compile rng selective activation checkpointing with cudagraph (#146878) 2025-02-28 00:47:03 +00:00
_prims_common [dynamic shapes] guard_or_false for _reshape_view_helper, utils._infer_size for wildcard dims (#150127) 2025-04-23 05:42:30 +00:00
_refs [MPSInductor] Fix masked_fill decomp (#152268) 2025-04-27 15:50:46 +00:00
_strobelight Enable strobelight profiling specific compile frame ids using COMPILE_STROBELIGHT_FRAME_FILTER (#147549) 2025-02-22 03:44:53 +00:00
_subclasses [dynamo] Add guard serialization for tensor matches. (#151318) 2025-04-25 14:16:23 +00:00
_vendor
accelerator Add torch.accelerator.device_index as accelerator's device switch context (#148864) 2025-04-25 09:45:25 +00:00
amp [Intel GPU] skip a cuda api call in amp to save some host overhead on xpu (#151111) 2025-04-13 06:37:07 +00:00
ao Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
autograd [BE] Migrate dtype_abbrs into one location (#152229) 2025-04-28 03:52:47 +00:00
backends Expose is_available API for torch.backends.mkldnn (#147432) 2025-04-10 05:05:37 +00:00
compiler [MegaCache] Return None on no compilation (#151921) 2025-04-23 04:32:06 +00:00
contrib
cpu [CPU Stream] Add noop for CPU stream record_event() and wait_event() (#145935) 2025-02-20 18:50:55 +00:00
csrc Fix clang-tidy suppression in torch/csrc/jit (#152271) 2025-04-27 21:18:39 +00:00
cuda Add option to use mempool on OOM (#151487) 2025-04-26 04:04:57 +00:00
distributed Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
distributions add generalized pareto distribution (GPD) (#135968) 2025-04-17 18:51:02 +00:00
export [DRAFT] Initial version of sticky export (#151047) 2025-04-23 22:58:43 +00:00
fft
func
futures PEP585: More UP006 fixes (#146392) 2025-02-20 06:18:13 +00:00
fx [BE] Migrate dtype_abbrs into one location (#152229) 2025-04-28 03:52:47 +00:00
jit Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
legacy
lib [1/N] Use internal linkage in torch/csrc C++ files. (#150930) 2025-04-11 02:19:31 +00:00
linalg Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
masked [BE][Easy]: Dedupe a TypeAlias in PrimsCommon (#151565) 2025-04-17 19:59:41 +00:00
monitor
mps [MPS] Make torch.mps.compile_shader public (#148972) 2025-03-11 20:20:58 +00:00
mtia [Kineto] Enable OOM observer (#152160) 2025-04-27 15:56:44 +00:00
multiprocessing
nested [aotd] Guess tangents stride as output strides (#144579) 2025-03-20 15:41:36 +00:00
nn Refactor to use torch.accelerator.device_index instead of torch.cuda.device for generic device context manager (#148880) 2025-04-25 09:45:25 +00:00
onnx Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
optim Add scripts to check xrefs and urls (#151844) 2025-04-28 09:30:07 +00:00
package
profiler [profiler][retry] don't disable CUPTI_LAZY_REINIT for cuda >= 12.6 (#151124) 2025-04-15 16:11:49 +00:00
quantization
signal
sparse Fix spelling (#149277) 2025-03-20 01:02:32 +00:00
special
testing [BE] Migrate dtype_abbrs into one location (#152229) 2025-04-28 03:52:47 +00:00
utils [BE] Migrate dtype_abbrs into one location (#152229) 2025-04-28 03:52:47 +00:00
xpu xpu: torch.xpu.get_arch_list() to return [] if xpu not compiled (#147431) 2025-02-24 01:35:54 +00:00
__config__.py
__future__.py
__init__.py [profiler][retry] don't disable CUPTI_LAZY_REINIT for cuda >= 12.6 (#151124) 2025-04-15 16:11:49 +00:00
_appdirs.py Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
_classes.py
_compile.py
_custom_ops.py
_deploy.py
_environment.py
_guards.py [dynamo] Add guard serialization for tensor matches. (#151318) 2025-04-25 14:16:23 +00:00
_jit_internal.py [BE][CI] bump ruff to 0.9.2: multiline assert statements (#144546) 2025-02-27 20:46:16 +00:00
_linalg_utils.py
_lobpcg.py Add scripts to check xrefs and urls (#151844) 2025-04-28 09:30:07 +00:00
_lowrank.py
_meta_registrations.py Non-deterministic alert in histc_cuda for floating types only (#151701) 2025-04-24 21:16:46 +00:00
_namedtensor_internals.py
_ops.py Introduce unsafe way to mark functions as cacheable (#151603) 2025-04-21 17:37:38 +00:00
_python_dispatcher.py
_size_docs.py
_sources.py
_storage_docs.py
_streambase.py
_tensor_docs.py Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
_tensor_str.py add torch.float4_e2m1fn_x2 to PyTorch (#148791) 2025-03-27 17:32:20 +00:00
_tensor.py Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
_thread_safe_fork.py
_torch_docs.py Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
_utils_internal.py [profiler][retry] don't disable CUPTI_LAZY_REINIT for cuda >= 12.6 (#151124) 2025-04-15 16:11:49 +00:00
_utils.py Allow torch.load under FakeTensorMode to load FakeTensors with correct devices (for plain Tensors) (#147786) 2025-03-06 12:04:32 +00:00
_VF.py
_vmap_internals.py Fix broken URLs (#152237) 2025-04-27 09:56:42 +00:00
_weights_only_unpickler.py Add sparse tensors constructed via legacy constructor to _sparse_tensors_to_validate (#147759) 2025-02-25 23:51:12 +00:00
CMakeLists.txt Add new dependences for gen_pyi.py (#150391) 2025-04-03 14:18:18 +00:00
custom_class_detail.h
custom_class.h Remove unneeded Clang-tidy suppression (#148246) 2025-03-01 16:51:54 +00:00
extension.h
functional.py Optimize cdist param description (#151178) 2025-04-14 13:53:10 +00:00
hub.py [BE][CI][Easy] bump ruff to 0.9.0: long statements in docstrings (#146509) 2025-02-24 19:56:08 +00:00
library.h Overload Library::def rather than templating it (#151626) 2025-04-18 22:51:16 +00:00
library.py fix spammy library deinit errors when user passes an invalid TORCH_LOGS argument (#151678) 2025-04-22 20:13:52 +00:00
overrides.py [CUDA][cuBLAS] Aten GEMM overload for FP32 output from FP16/BF16 inputs (#150812) 2025-04-18 01:53:26 +00:00
py.typed
quasirandom.py
random.py Update description for torch.random.fork_rng (#151881) 2025-04-23 16:59:29 +00:00
README.md Rename README.txt to README.md (#149811) 2025-03-24 22:33:33 +00:00
return_types.py
script.h
serialization.py Move get accelerator to use build time flags when possible (#146098) 2025-03-10 13:17:58 +00:00
storage.py add torch.float4_e2m1fn_x2 to PyTorch (#148791) 2025-03-27 17:32:20 +00:00
torch_version.py
types.py
version.py.tpl

Note [TH abstraction violation]

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers serve double duty as *internal implementation
detail* headers, whose contents should largely not be used by external
clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.