pytorch/torch/distributed/checkpoint
Ankita George 85e4e51a7d Fix bug in _load_state_dict_from_keys method (#150058)
Summary:
The `_load_state_dict_from_keys` method's docstring states: `Loads any key specified in this set. If no keys are specified, the entire checkpoint is loaded.`
However, that is not what happens today: when no keys are given, an empty `set()` is forwarded to `_load_state_dict`, while `_load_state_dict` expects `keys` to be `None` for entries to actually be included in the state_dict (https://fburl.com/code/l8yzojyx). As a result, with the empty `set()` argument the returned state_dict is always empty.
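
For illustration, a minimal sketch of the behavior the fix needs (the helper name `_normalize_keys` is hypothetical and not the actual diff in this PR): normalize the `keys` argument so that an empty selection reaches `_load_state_dict` as `None`, which it interprets as "load the entire checkpoint".

```python
# Hypothetical sketch only -- not the exact change in this PR.
from typing import Optional, Set, Union


def _normalize_keys(keys: Optional[Union[str, Set[str]]]) -> Optional[Set[str]]:
    """Map "no keys requested" to None so _load_state_dict loads everything."""
    if keys is None:
        return None  # no filter: load the full checkpoint
    if isinstance(keys, str):
        keys = {keys}
    keys = set(keys)
    return keys if keys else None  # an empty set also means "no filter"
```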

Test Plan: ensure existing tests pass

Differential Revision: D71930712

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150058
Approved by: https://github.com/saumishr
2025-03-27 16:36:00 +00:00
examples [reland][dtensor] move DTensor to public namespace (#134203) 2024-09-08 17:08:40 +00:00
__init__.py Support huggingface reading and writing for multi rank case (#148189) 2025-03-26 14:47:31 +00:00
_async_executor.py [DCP] Introduce process based async checkpointing (#147039) 2025-03-04 13:33:28 +00:00
_async_process_executor.py [DCP] Introduce process based async checkpointing (#147039) 2025-03-04 13:33:28 +00:00
_async_thread_executor.py [DCP] Introduce process based async checkpointing (#147039) 2025-03-04 13:33:28 +00:00
_checkpointer.py PEP585 update - torch/distributed/elastic torch/distributed/checkpoint (#145163) 2025-01-19 20:55:59 +00:00
_dedup_save_plans.py Support huggingface reading and writing for multi rank case (#148189) 2025-03-26 14:47:31 +00:00
_dedup_tensors.py PEP585 update - torch/distributed/elastic torch/distributed/checkpoint (#145163) 2025-01-19 20:55:59 +00:00
_extension.py PEP585: Missed conversions (#145342) 2025-01-29 05:24:36 +00:00
_fsspec_filesystem.py Revert "Build a storage reader/writer to write checkpoints in HF format (#146352)" 2025-02-21 07:30:52 +00:00
_hf_planner.py Support huggingface reading and writing for multi rank case (#148189) 2025-03-26 14:47:31 +00:00
_hf_storage.py Support huggingface reading and writing for multi rank case (#148189) 2025-03-26 14:47:31 +00:00
_nested_dict.py PEP585 update - torch/distributed/elastic torch/distributed/checkpoint (#145163) 2025-01-19 20:55:59 +00:00
_sharded_tensor_utils.py [BE][Easy] enable UFMT for torch/distributed/{algorithms,autograd,benchmarks,checkpoint,elastic}/ (#128866) 2024-06-18 13:51:53 +00:00
_storage_utils.py PEP585 update - torch/distributed/elastic torch/distributed/checkpoint (#145163) 2025-01-19 20:55:59 +00:00
_traverse.py PEP585 update - torch/distributed/elastic torch/distributed/checkpoint (#145163) 2025-01-19 20:55:59 +00:00
_version.py [DCP] Fixes the BC issue where the traversal doesn't support versions before 2.4 (#134158) 2024-08-28 16:31:44 +00:00
api.py PEP585 update - torch/distributed/elastic torch/distributed/checkpoint (#145163) 2025-01-19 20:55:59 +00:00
default_planner.py Support huggingface reading and writing for multi rank case (#148189) 2025-03-26 14:47:31 +00:00
filesystem.py Build a storage reader/writer to write checkpoints in HF format (#148089) 2025-02-28 07:38:10 +00:00
format_utils.py PEP585 update - torch/distributed/elastic torch/distributed/checkpoint (#145163) 2025-01-19 20:55:59 +00:00
logger.py [DCP] Introduce process based async checkpointing (#147039) 2025-03-04 13:33:28 +00:00
logging_handlers.py PEP585 update - torch/distributed/elastic torch/distributed/checkpoint (#145163) 2025-01-19 20:55:59 +00:00
metadata.py [DCP] Introduce modules metadata in the storage_meta (#146654) 2025-02-13 17:44:30 +00:00
optimizer.py [BE][PYFMT] migrate PYFMT for torch.{distributed,distributions} to ruff format (#144547) 2025-02-28 07:35:56 +00:00
planner_helpers.py [DCP] Cache save plan metadata to reduce the collective overhead (#149785) 2025-03-25 02:00:15 +00:00
planner.py [DCP] Cache save plan metadata to reduce the collective overhead (#149785) 2025-03-25 02:00:15 +00:00
resharding.py PEP585 update - torch/distributed/elastic torch/distributed/checkpoint (#145163) 2025-01-19 20:55:59 +00:00
staging.py Fix staging for CPU tensors in OSS DCP async_save (#145408) 2025-01-23 12:49:26 -08:00
state_dict_loader.py Fix bug in _load_state_dict_from_keys method (#150058) 2025-03-27 16:36:00 +00:00
state_dict_saver.py [DCP] Introduce process based async checkpointing (#147039) 2025-03-04 13:33:28 +00:00
state_dict.py [State_dict] Remove functools.cache and add unit test (#149354) 2025-03-18 17:30:41 +00:00
stateful.py PEP585 update - torch/distributed/elastic torch/distributed/checkpoint (#145163) 2025-01-19 20:55:59 +00:00
storage.py PEP585 update - torch/distributed/elastic torch/distributed/checkpoint (#145163) 2025-01-19 20:55:59 +00:00
utils.py [DCP] fix dcp gather_object/scatter_object_list (#147675) 2025-03-06 21:20:38 +00:00