pytorch/torch
Contents of torch/:

Subdirectories:
_awaits, _C, _C_flatbuffer, _custom_op, _decomp, _dispatch, _dynamo, _export,
_functorch, _higher_order_ops, _inductor, _lazy, _library, _logging, _numpy,
_prims, _prims_common, _refs, _strobelight, _subclasses, _vendor, accelerator,
amp, ao, autograd, backends, compiler, contrib, cpu, csrc, cuda, distributed,
distributions, export, fft, func, futures, fx, jit, legacy, lib, linalg,
masked, monitor, mps, mtia, multiprocessing, nested, nn, onnx, optim, package,
profiler, quantization, signal, sparse, special, testing, utils, xpu

Files:
__config__.py, __future__.py, __init__.py, _appdirs.py, _classes.py,
_compile.py, _custom_ops.py, _deploy.py, _environment.py, _guards.py,
_jit_internal.py, _linalg_utils.py, _lobpcg.py, _lowrank.py,
_meta_registrations.py, _namedtensor_internals.py, _ops.py,
_python_dispatcher.py, _size_docs.py, _sources.py, _storage_docs.py,
_streambase.py, _tensor_docs.py, _tensor_str.py, _tensor.py,
_thread_safe_fork.py, _torch_docs.py, _utils_internal.py, _utils.py, _VF.py,
_vmap_internals.py, _weights_only_unpickler.py, CMakeLists.txt,
custom_class_detail.h, custom_class.h, extension.h, functional.py, hub.py,
library.h, library.py, overrides.py, py.typed, quasirandom.py, random.py,
README.md, return_types.py, script.h, serialization.py, storage.py,
torch_version.py, types.py, version.py.tpl

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  Although these headers are installed alongside the public ones,
they are *internal implementation detail* headers, whose contents should
largely not be used by external clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.