pytorch/torch/distributed/algorithms
Andrew Gu 1c37119a1f [FSDP] New fix for composing with other module wrappers (#87950)
We change `.module` to pass through `ActivationWrapper` directly to the inner wrapped module. This should fix the state dict issues.
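
A minimal, hypothetical sketch of this pass-through, using simplified stand-ins (`FSDPLike`, `ActivationWrapper`, and the `_fsdp_wrapped_module` / `_checkpoint_wrapped_module` attribute names) rather than the actual FSDP implementation:

```python
import torch.nn as nn

class ActivationWrapper(nn.Module):
    # Stand-in for an activation-checkpointing wrapper (hypothetical).
    def __init__(self, module: nn.Module) -> None:
        super().__init__()
        self._checkpoint_wrapped_module = module

    def forward(self, *args, **kwargs):
        return self._checkpoint_wrapped_module(*args, **kwargs)

class FSDPLike(nn.Module):
    # Stand-in for FSDP; only the `.module` property matters here.
    def __init__(self, module: nn.Module) -> None:
        super().__init__()
        self._fsdp_wrapped_module = module

    @property
    def module(self) -> nn.Module:
        # Pass through any intermediate ActivationWrapper so that
        # `.module` always returns the innermost wrapped module.
        module = self._fsdp_wrapped_module
        if isinstance(module, ActivationWrapper):
            return module._checkpoint_wrapped_module
        return module
```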

Given the invariant that `.module` always returns the inner wrapped module, FSDP always registers the `FlatParameter` on the inner wrapped module, regardless of whether there is an intermediate `ActivationWrapper`. This avoids having to special-case whether `ActivationWrapper` is added before or after FSDP construction.
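
Continuing the sketch above, the invariant can be checked in both orderings (a usage example under the same hypothetical names):

```python
inner = nn.Linear(4, 4)

fsdp_with_ckpt = FSDPLike(ActivationWrapper(inner))  # wrapper added before FSDP
fsdp_plain = FSDPLike(inner)                         # no activation wrapper

# `.module` resolves to the same inner module either way, so a
# (hypothetical) FlatParameter registration always targets `inner`:
assert fsdp_with_ckpt.module is inner
assert fsdp_plain.module is inner
# e.g., fsdp_with_ckpt.module.register_parameter("flat_param", flat_param)
```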

This PR removes the unit test added in `test_fsdp_misc.py` for changing the wrapped module because I would rather not complicate the `_lazy_init()` logic just to support that kind of adversarial behavior. The user should not be swapping out the wrapped module arbitrarily or deleting the `FlatParameter`. I mainly had that test to make sure that all branches of the code I added were correct.

Differential Revision: [D40799961](https://our.internmc.facebook.com/intern/diff/D40799961)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87950
Approved by: https://github.com/zhaojuanmao
2022-10-28 21:11:40 +00:00
_checkpoint [FSDP] New fix for composing with other module wrappers (#87950) 2022-10-28 21:11:40 +00:00
_comm_hooks [FSDP] Use reduce_scatter_tensor() (#87240) 2022-10-24 11:29:23 +00:00
_optimizer_overlap make fsdp folder to be public (#72084) 2022-02-02 15:50:14 +00:00
_quantization Change docstring type callable to Callable for consistency (#82487) 2022-08-01 17:26:09 +00:00
ddp_comm_hooks Add __all__ for a few distributed modules plus a little typing (reland) (#84872) 2022-09-13 21:57:49 +00:00
model_averaging Update hierarchical_model_averager.py (#85648) 2022-10-03 06:15:20 +00:00
__init__.py Make _Join, _Joinable, _JoinHook public (#62605) 2021-08-03 12:20:11 -07:00
join.py Integrate xdoctest - Rebased (#82797) 2022-08-12 02:08:01 +00:00