pytorch/torch/distributed
Chien-Chin Huang c4fc5d372f [FSDP][state_dict][1/N] Moving state_dict logic to pre_state_dict_hook (#87900)
This is one step toward the ultimate goal: removing the overridden `state_dict` in FSDP. All the logic should live in either `pre_state_dict_hook` or `post_state_dict_hook`.

Since the current `nn.Module` does not support `pre_state_dict_hook`, this PR mimics `pre_state_dict_hook` by calling the pre hook inside the post hook, effectively discarding all the work done by `nn.Module.state_dict`. Once `pre_state_dict_hook` is supported by `nn.Module`, these pre hook calls can be moved out of the post hooks and registered with `nn.Module.pre_state_dict_hook`.
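A minimal sketch of this pattern, not the actual FSDP implementation: `_pre_state_dict_hook` and the parameter rebuild below are placeholders. The pre hook runs at the top of the registered post hook, and the entries that `nn.Module.state_dict` already wrote for this module are redone afterward.

```python
import torch.nn as nn

def _pre_state_dict_hook(module, prefix):
    # Placeholder pre-hook: in FSDP this is where state would be
    # prepared (e.g., unsharding parameters) before export.
    pass

def _post_state_dict_hook(module, state_dict, prefix, local_metadata):
    # Mimic a pre-hook by invoking it first, even though
    # nn.Module.state_dict has already populated state_dict.
    _pre_state_dict_hook(module, prefix)
    # Rebuild the entries for this module, discarding the work done
    # by nn.Module.state_dict.
    for name, param in module.named_parameters(recurse=False):
        state_dict[prefix + name] = param.detach().clone()
    return state_dict

module = nn.Linear(4, 4)
# _register_state_dict_hook is nn.Module's private post-hook
# registration API; the hook's return value replaces the state dict.
module._register_state_dict_hook(_post_state_dict_hook)
state = module.state_dict()
```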

The major issue with this temporary solution is that `post_state_dict_hook` is called from the leaf modules up to the root. This makes calling `module._lazy_init()` from the hooks invalid, as FSDP assumes `_lazy_init()` is first called on the root module. As a result, `FSDP.state_dict` currently contains only one piece of logic: calling `module._lazy_init()`.
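A minimal sketch of the resulting override, assuming a hypothetical FSDP-like wrapper with a `_lazy_init()` method: `state_dict()` is entered at the root before any hooks fire, so the root-first initialization has to live there for now.

```python
import torch.nn as nn

class FSDPLike(nn.Module):
    def _lazy_init(self):
        # Root-first initialization; FSDP assumes this runs on the
        # root module before any descendant needs it.
        pass

    def state_dict(self, *args, **kwargs):
        # The post hooks fire leaf-to-root, too late for root-first
        # setup, so this is the one piece of logic left in the override.
        self._lazy_init()
        return super().state_dict(*args, **kwargs)
```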

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87900
Approved by: https://github.com/rohan-varma
2022-11-11 03:41:40 +00:00
_composable [FSDP()][Easy] Make fully_shard() only FULL_SHARD (#88260) 2022-11-03 13:41:54 +00:00
_shard rename DisableTorchFunction to DisableTorchFunctionSubclass (#88218) 2022-11-10 14:51:13 +00:00
_sharded_tensor
_sharding_spec Add __all__ for a few distributed modules plus a little typing (reland) (#84872) 2022-09-13 21:57:49 +00:00
_spmd Remove eager mode support from CommTensor (#84978) 2022-09-14 17:23:23 +00:00
algorithms Fix typos used in documents under torch directory (#88300) 2022-11-02 09:38:13 +00:00
autograd Integrate xdoctest - Rebased (#82797) 2022-08-12 02:08:01 +00:00
benchmarks Fix typo under torch directory (#87274) 2022-10-21 14:22:20 +00:00
elastic Make TorchElastic timer importable on Windows (#88522) 2022-11-10 17:42:20 +00:00
fsdp [FSDP][state_dict][1/N] Moving state_dict logic to pre_state_dict_hook (#87900) 2022-11-11 03:41:40 +00:00
launcher
nn [nn] add remove_duplicate flag to named_parameters (#759) (#88090) 2022-11-09 00:09:20 +00:00
optim Upstream apply_optim_in_backward from TorchRec (#87397) (#88539) 2022-11-05 18:28:07 +00:00
pipeline Deprecate TypedStorage, its derived classes, and all of their public methods (#85303) 2022-11-08 18:11:01 +00:00
rpc [Python] refactor slices on sorted (#86995) 2022-10-25 04:07:19 +00:00
__init__.py Add torch.distributed.DistBackendError exception type, thrown from C10D_NCCL_CHECK (#88134) 2022-11-08 13:26:42 +00:00
argparse_util.py
c10d_error_logger.py [C10D][BE] Add exception handlers to c10d collectives function (#87643) (#87988) 2022-10-29 04:38:34 +00:00
constants.py
CONTRIBUTING.md
distributed_c10d.py [14/N] Refactor _new_process_group_helper() to remove repeated code (#88351) 2022-11-10 19:27:17 +00:00
launch.py Integrate xdoctest - Rebased (#82797) 2022-08-12 02:08:01 +00:00
logging_handlers.py [C10D][BE] Add exception handlers to c10d collectives function (#87643) (#87988) 2022-10-29 04:38:34 +00:00
remote_device.py
rendezvous.py
run.py Integrate xdoctest - Rebased (#82797) 2022-08-12 02:08:01 +00:00
utils.py [DDP] Add PackedSequence support when device_ids is specified (#86614) 2022-10-10 21:50:59 +00:00