pytorch/torch
fduwjj 06e9deabb6 [c10d][fr] Improve FR dump robustness with all watchdog broadcast wait and more frequent store check (#150652)
While debugging missing FR dumps and missing dump logs, I made a couple of initial findings:
1. On the same rank, if a second watchdog timeout fires on a different PG (or sub-PG), that watchdog thread immediately throws an exception instead of sleeping. We fix this by making that watchdog thread also wait for 1 minute.
2. The FR dump takes about 900ms to 1200ms, so we were not checking the store frequently enough. Rather than changing the check frequency from 1s to 300ms, we ultimately decided to have all ranks sleep for 1 minute universally instead of using a promise.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150652
Approved by: https://github.com/kwen2501
2025-04-07 16:33:27 +00:00
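The waiting pattern the PR describes can be sketched in miniature. This is a hypothetical illustration, not the actual c10d implementation: the function name, the flag, and the durations are all stand-ins for the real watchdog/store machinery.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical sketch of the fix described above: on a collective timeout,
// a watchdog thread polls a shared "dump finished" flag for up to a fixed
// window instead of throwing immediately, so the flight-recorder dump
// (which takes roughly a second) has time to complete.
bool waitForFlightRecorderDump(
    std::atomic<bool>& dumpDone,
    std::chrono::milliseconds window,        // the PR waits ~1 minute
    std::chrono::milliseconds pollInterval)  // check the store frequently
{
  const auto deadline = std::chrono::steady_clock::now() + window;
  while (std::chrono::steady_clock::now() < deadline) {
    if (dumpDone.load()) {
      return true;  // dump completed within the window
    }
    std::this_thread::sleep_for(pollInterval);
  }
  return dumpDone.load();  // final check once the window expires
}
```

With this shape, a second watchdog thread that times out simply waits out the same window as the first rather than throwing, which matches the "all ranks sleep universally" decision in the PR description.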
_awaits
_C Revert "[fx] Move Node._prepend/Node._remove_from_list to C++ (#148261)" (#150542) 2025-04-03 21:15:38 +00:00
_C_flatbuffer
_custom_op
_decomp Remove aten.elu core ATen decomp because it is now core ATen (#149780) 2025-03-25 01:59:57 +00:00
_dispatch [BE][PYFMT] migrate PYFMT for torch._dynamo to ruff format (#144549) 2025-02-28 03:03:53 +00:00
_dynamo Generalize compile collective to avoid cuda-bias (#150405) 2025-04-07 01:54:20 +00:00
_export [export] Make aoti_call_delegate hop traceable (#148804) 2025-04-03 20:44:31 +00:00
_functorch Make CompileEventLogger more defensive w.r.t to AOTAutogradCache and FXGraphCache (#150423) 2025-04-04 01:55:13 +00:00
_higher_order_ops [export] Make aoti_call_delegate hop traceable (#148804) 2025-04-03 20:44:31 +00:00
_inductor Add RECORD_FUNCTION for AOTI (#150150) 2025-04-07 15:12:29 +00:00
_lazy
_library [custom_ops][perf] Move expensive pytree traversals of tensors to C++ (#148555) 2025-04-01 18:45:48 +00:00
_logging [export] Beef up guard_added logs (#149465) 2025-03-20 23:02:07 +00:00
_numpy
_prims Support torch.compile rng selective activation checkpointing with cudagraph (#146878) 2025-02-28 00:47:03 +00:00
_prims_common PEP585: More UP006 fixes (#146392) 2025-02-20 06:18:13 +00:00
_refs [export] fix stft decomp and making it consistent with cpp impl. (#149232) 2025-03-19 18:40:35 +00:00
_strobelight Enable strobelight profiling specific compile frame ids using COMPILE_STROBELIGHT_FRAME_FILTER (#147549) 2025-02-22 03:44:53 +00:00
_subclasses [aoti] Fix cannot determine truth value of Relation error when propagating unbacked symint in lowering (#150570) 2025-04-03 20:06:15 +00:00
_vendor
accelerator Move get accelerator to use build time flags when possible (#146098) 2025-03-10 13:17:58 +00:00
amp [MPS] grad scaler (#150255) 2025-04-06 17:06:55 +00:00
ao [Codemod][AddExplicitStrictExportForTrainingInferenceArg] caffe2/ (#149595) 2025-04-03 23:50:13 +00:00
autograd Compare device name of profiler dynamically (#150396) 2025-04-02 06:06:06 +00:00
backends [ROCm] change preferred blas lib defaults (#150212) 2025-03-29 03:33:07 +00:00
compiler [dynamo] add reason field to torch.compiler.disable (#150341) 2025-04-02 04:26:48 +00:00
contrib
cpu [CPU Stream] Add noop for CPU stream record_event() and wait_event() (#145935) 2025-02-20 18:50:55 +00:00
csrc [c10d][fr] Improve FR dump robustness with all watchdog broadcast wait and more frequent store check (#150652) 2025-04-07 16:33:27 +00:00
cuda [ROCm][TunableOp] Stricter unit tests for online and offline tuning (#150142) 2025-03-31 04:12:08 +00:00
distributed [torchrec] update local_shards_wrapper to latest version (#150469) 2025-04-07 13:00:52 +00:00
distributions [typing] Add type hints to __init__ methods in torch.distributions. (#144197) 2025-04-06 17:50:35 +00:00
export [export] specialize for aten.to (#149235) 2025-04-03 05:20:10 +00:00
fft
func
futures PEP585: More UP006 fixes (#146392) 2025-02-20 06:18:13 +00:00
fx [MTIA] Map names to operand indices when folding submodules (#150692) 2025-04-06 03:11:14 +00:00
jit scriptfunction: Make sure we have valid __name__ and __qualname__ (#147906) 2025-02-28 23:25:47 +00:00
legacy
lib [codemod] Fix missing field initializer in caffe2/torch/lib/libshm/manager.cpp +1 (#148393) 2025-03-04 04:20:04 +00:00
linalg Implement gradient for the residuals of torch.linalg.lstsq (#148526) 2025-03-10 12:35:09 +00:00
masked Use variadic length tuple for torch.masked.DimOrDims (#149870) 2025-03-31 07:06:58 +00:00
monitor
mps [MPS] Make torch.mps.compile_shader public (#148972) 2025-03-11 20:20:58 +00:00
mtia [MTIA] Add _mtia_maybeExchangeDevice to MTIA module (#149340) 2025-03-18 15:15:12 +00:00
multiprocessing
nested [aotd] Guess tangents stride as output strides (#144579) 2025-03-20 15:41:36 +00:00
nn Move formulas on separate line in loss.py (#150565) 2025-04-03 20:47:35 +00:00
onnx [export] refactor _Dim into Dim (#149891) 2025-03-28 06:19:03 +00:00
optim [MPS] grad scaler (#150255) 2025-04-06 17:06:55 +00:00
package Remove code for Python < 3.9 (#147097) 2025-02-14 03:22:49 +00:00
profiler [BE][Ez]: Use itertools.chain.from_iterable when possible (#148190) 2025-03-06 20:37:06 +00:00
quantization
signal
sparse Fix spelling (#149277) 2025-03-20 01:02:32 +00:00
special
testing cpp_wrapper: Fix even more tests (#147225) 2025-04-07 14:20:06 +00:00
utils Revert "bound sympy accuracy (#150383)" 2025-04-04 16:26:00 +00:00
xpu xpu: torch.xpu.get_arch_list() to return [] if xpu not compiled (#147431) 2025-02-24 01:35:54 +00:00
__config__.py
__future__.py
__init__.py Fix #149806 : Fix path lookup in _preload_cuda_deps (#149808) 2025-03-25 23:03:47 +00:00
_appdirs.py
_classes.py
_compile.py
_custom_ops.py
_deploy.py
_environment.py
_guards.py [dynamo] Always trace into tensor subclass __torch_function__ (#149792) 2025-04-02 20:57:00 +00:00
_jit_internal.py [BE][CI] bump ruff to 0.9.2: multiline assert statements (#144546) 2025-02-27 20:46:16 +00:00
_linalg_utils.py
_lobpcg.py [BE][CI] bump ruff to 0.9.2: multiline assert statements (#144546) 2025-02-27 20:46:16 +00:00
_lowrank.py
_meta_registrations.py enable torch.compile for torch._scaled_mm nvfp4 recipe (#150462) 2025-04-02 01:08:40 +00:00
_namedtensor_internals.py
_ops.py Add Any return annotation to __getattr__ methods that return a union of types. (#150204) 2025-04-02 05:25:07 +00:00
_python_dispatcher.py
_size_docs.py
_sources.py
_storage_docs.py
_streambase.py
_tensor_docs.py Add type hints to _tensor_docs.add_docstr_all (#150715) 2025-04-06 22:25:34 +00:00
_tensor_str.py add torch.float4_e2m1fn_x2 to PyTorch (#148791) 2025-03-27 17:32:20 +00:00
_tensor.py Revert "Fix non-bitwise type annotations for Tensor operators (see #145838) (#146845)" 2025-02-18 19:01:27 +00:00
_thread_safe_fork.py
_torch_docs.py Optimize torch.equal description (#149618) 2025-03-21 03:44:49 +00:00
_utils_internal.py [ROCm] OCP FP8 Support for new GPUs (#146632) 2025-02-24 22:47:52 +00:00
_utils.py Allow torch.load under FakeTensorMode to load FakeTensors with correct devices (for plain Tensors) (#147786) 2025-03-06 12:04:32 +00:00
_VF.py
_vmap_internals.py
_weights_only_unpickler.py Add sparse tensors constructed via legacy constructor to _sparse_tensors_to_validate (#147759) 2025-02-25 23:51:12 +00:00
CMakeLists.txt Add new dependences for gen_pyi.py (#150391) 2025-04-03 14:18:18 +00:00
custom_class_detail.h
custom_class.h Remove unneeded Clang-tidy suppression (#148246) 2025-03-01 16:51:54 +00:00
extension.h
functional.py Fix invalid nested int guarding in broadcast_shapes() (#145957) 2025-03-11 00:53:13 +00:00
hub.py [BE][CI][Easy] bump ruff to 0.9.0: long statements in docstrings (#146509) 2025-02-24 19:56:08 +00:00
library.h [pytorch] add experimental TORCH_LIBRARY_THREAD_UNSAFE_LAZY_INIT (#150537) 2025-04-03 22:36:17 +00:00
library.py [Docs] Make torch.Library's kind have no default value to be consistent with the code (#149390) 2025-03-21 04:42:10 +00:00
overrides.py Use Python 3.9 typing (#148157) 2025-03-04 03:09:55 +00:00
py.typed
quasirandom.py
random.py
README.md Rename README.txt to README.md (#149811) 2025-03-24 22:33:33 +00:00
return_types.py
script.h
serialization.py Move get accelerator to use build time flags when possible (#146098) 2025-03-10 13:17:58 +00:00
storage.py add torch.float4_e2m1fn_x2 to PyTorch (#148791) 2025-03-27 17:32:20 +00:00
torch_version.py
types.py
version.py.tpl

Note [TH abstraction violation]
TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers serve double duty as *internal implementation
detail* headers, whose contents should largely not be used by external
clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.
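The public-header-vs-internal-header split can be shown in miniature. The names below are illustrative stand-ins, not the real TH declarations: `Tensor` plays the role of the struct defined in an internal `.hpp`, and `Tensor_nDimension` plays the role of a public accessor from the corresponding `.h`.

```cpp
// Hypothetical illustration of the abstraction boundary described in the
// note above; names are stand-ins for THTensor and its accessors.

// What an internal .hpp exposes: the struct layout (implementation detail).
struct Tensor {
  int ndim;  // external clients should not read this field directly
};

// What the public .h exposes: accessor functions over the handle, so the
// struct layout can change without breaking external clients.
inline int Tensor_nDimension(const Tensor* t) {
  return t->ndim;
}
```

In this sketch, code outside torch/csrc would call `Tensor_nDimension(t)` rather than touching `t->ndim`; the sites marked with a pointer to this note are the known places that reach through the boundary anyway.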