pytorch/torch/distributed
Ankita George 6a658d983e Build a storage reader/writer to write checkpoints in HF format (#147622)
As the title says: we want to write checkpoints in HF format with DCP, and this diff enables that for the non-distributed use case.
This is a copy of [D68444967](https://www.internalfb.com/diff/D68444967) (https://github.com/pytorch/pytorch/pull/146352). That diff was reverted because of lint errors: the linter flagged imports of uninstalled libraries. Those imports are intentional, since we don't want to install safetensors and huggingface as dependencies, so this new diff explicitly suppresses that lint to avoid the error.
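
For context, a rough sketch of how an HF-format storage writer/reader would plug into DCP's existing `save`/`load` entry points in the non-distributed case. The `HuggingFaceStorageWriter`/`HuggingFaceStorageReader` names, import path, and constructor arguments below are illustrative assumptions, not necessarily the API added by this PR.

```python
import torch
import torch.distributed.checkpoint as dcp

# Hypothetical class names and import path -- the classes added by #147622
# may be named and located differently.
from torch.distributed.checkpoint import (
    HuggingFaceStorageReader,
    HuggingFaceStorageWriter,
)

model = torch.nn.Linear(16, 16)

# Save in HF (safetensors) format; non-distributed use case, so no
# process group needs to be initialized.
dcp.save(
    {"model": model.state_dict()},
    storage_writer=HuggingFaceStorageWriter(path="./hf_checkpoint"),
)

# Load back into a freshly constructed module.
new_model = torch.nn.Linear(16, 16)
state_dict = {"model": new_model.state_dict()}
dcp.load(
    state_dict,
    storage_reader=HuggingFaceStorageReader(path="./hf_checkpoint"),
)
new_model.load_state_dict(state_dict["model"])
```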

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147622
Approved by: https://github.com/saumishr
2025-02-26 20:47:54 +00:00
| Name | Last commit message | Last commit date |
| --- | --- | --- |
| _composable | type fully_shard so that the return value can be chained with typing enabled (#147489) | 2025-02-20 08:43:16 +00:00 |
| _shard | PEP585: More UP006 fixes (#146392) | 2025-02-20 06:18:13 +00:00 |
| _sharded_tensor | | |
| _sharding_spec | | |
| _symmetric_memory | PEP585: More UP006 fixes (#146392) | 2025-02-20 06:18:13 +00:00 |
| _tensor | | |
| _tools | Turn on mypy for _dynamo/variables/builtin.py (#145552) | 2025-01-30 22:21:32 +00:00 |
| algorithms | Revert "Fix non-bitwise type annotations for Tensor operators (see #145838) (#146845)" | 2025-02-18 19:01:27 +00:00 |
| autograd | | |
| benchmarks | [BE][CI] bump ruff to 0.8.4 (#143753) | 2024-12-24 12:24:10 +00:00 |
| checkpoint | Build a storage reader/writer to write checkpoints in HF format (#147622) | 2025-02-26 20:47:54 +00:00 |
| elastic | PEP585: More UP006 fixes (#146392) | 2025-02-20 06:18:13 +00:00 |
| examples | | |
| fsdp | Fix bug in FSDP wrapped module with zero argument (#147771) | 2025-02-26 01:40:53 +00:00 |
| launcher | PEP585 update - torch/distributed (#145164) | 2025-01-21 04:23:29 +00:00 |
| nn | PEP585 update - torch/distributed (#145164) | 2025-01-21 04:23:29 +00:00 |
| optim | [BE][Ez]: ISC001 Auto concatenate implicit one line strings (#146408) | 2025-02-04 19:07:04 +00:00 |
| pipelining | [PP] Remove extra code and docs BE (#147636) | 2025-02-22 00:10:31 +00:00 |
| rpc | PEP585: More UP006 fixes (#146392) | 2025-02-20 06:18:13 +00:00 |
| tensor | [DTensor][random] defer DTensor RNG state sync until first random op call or manual_seed call; support more flexible OffsetBasedRNGTracker init (#147025) | 2025-02-26 17:33:22 +00:00 |
| __init__.py | PEP585 update - torch/distributed (#145164) | 2025-01-21 04:23:29 +00:00 |
| _checkpointable.py | PEP585 update - torch/distributed (#145164) | 2025-01-21 04:23:29 +00:00 |
| _composable_state.py | | |
| _functional_collectives_impl.py | PEP585 update - torch/distributed (#145164) | 2025-01-21 04:23:29 +00:00 |
| _functional_collectives.py | more dist ops in non strict (#147417) | 2025-02-19 21:29:16 +00:00 |
| _serialization.py | PEP585: More UP006 fixes (#146392) | 2025-02-20 06:18:13 +00:00 |
| _state_dict_utils.py | Let _create_cpu_state_dict and _copy_state_dict support DTensor (#146852) | 2025-02-12 18:43:52 +00:00 |
| argparse_util.py | | |
| c10d_logger.py | PEP585 update - torch/distributed (#145164) | 2025-01-21 04:23:29 +00:00 |
| collective_utils.py | PEP585 update - torch/distributed (#145164) | 2025-01-21 04:23:29 +00:00 |
| constants.py | | |
| CONTRIBUTING.md | | |
| device_mesh.py | PEP585 update - torch/distributed (#145164) | 2025-01-21 04:23:29 +00:00 |
| distributed_c10d.py | Enable ruff rule S324 (#147665) | 2025-02-25 18:27:34 +00:00 |
| launch.py | | |
| logging_handlers.py | PEP585 update - torch/distributed (#145164) | 2025-01-21 04:23:29 +00:00 |
| remote_device.py | | |
| rendezvous.py | PEP585 update - torch/distributed (#145164) | 2025-01-21 04:23:29 +00:00 |
| run.py | Improve torchrun documentation (#144354) | 2025-01-24 20:40:05 +00:00 |
| utils.py | PEP585 update - torch/distributed (#145164) | 2025-01-21 04:23:29 +00:00 |