pytorch/torch/distributed/tensor/parallel
wz337 8140494afd [3/N][2D] Enable training with new 2D flow (#110034)
Replaces https://github.com/pytorch/pytorch/pull/109553, which was reverted.

This PR enables training with the new 2D flow and adds an associated test. In addition, it moves the FSDP-specific code in tensor/parallel/_data_parallel_utils.py back to tensor/parallel/fsdp.py to avoid a circular dependency for ddp.py and test/distributed/tensor/parallel/test_ddp_2d_parallel.py.

state_dict-related changes will follow in later PRs.

cc. @fegin, @fduwjj, @wanchaol, @awgu
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110034
Approved by: https://github.com/fduwjj
2023-09-26 09:14:15 +00:00
..
__init__.py Clean up unused MHA code to avoid confusion (#105956) 2023-07-27 17:10:17 +00:00
_data_parallel_utils.py [3/N][2D] Enable training with new 2D flow (#110034) 2023-09-26 09:14:15 +00:00
_utils.py Get rid of dim_groups attribute from DeviceMesh (#103105) 2023-06-09 04:11:15 +00:00
_view_with_dim_change.py [TP][DTensor Perf] Some perf improvement to reduce DTensor CPU overhead (#106524) 2023-08-14 20:03:19 +00:00
api.py [dtensor][1/n] refactor op dispatch logic to reduce overhead (#107305) 2023-08-18 18:30:46 +00:00
ddp.py [2D][TP] Enable DDP TP integration with unit test (#106583) 2023-08-17 02:54:17 +00:00
fsdp.py [3/N][2D] Enable training with new 2D flow (#110034) 2023-09-26 09:14:15 +00:00
input_reshard.py [Reland] Update mypy to 1.4.1 (#105227) 2023-07-15 20:30:20 +00:00
style.py [TP][EZ] Update doc for TP parallel style (#107819) 2023-08-24 00:13:52 +00:00