pytorch/torch/distributed
Wanchao Liang a26480a4d1 [dtensor] move early return check into redistribute autograd function (#121653)
This PR fixes a bug in redistribute by moving the early return check into
the redistribute autograd function. Even when we redistribute to the same
placement, the grad_placements from the `to_local` call might be
different, so the redistribute backward still needs to happen.
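
The control-flow change described above can be illustrated with a toy model in plain Python. This is only a sketch under stated assumptions: `ToyRedistribute`, `convert`, `redistribute_old`, and `redistribute_new` are hypothetical names invented for illustration, placements are plain strings, and none of this is the actual DTensor code. It shows why skipping the autograd node on a same-placement redistribute loses the backward conversion, while keeping the check inside the node preserves it.

```python
def convert(src, dst, log):
    """Pretend to run a collective; record it only when placements differ."""
    if src != dst:
        log.append((src, dst))
    return dst


class ToyRedistribute:
    """Hypothetical stand-in for an autograd function node (not the DTensor API)."""

    def __init__(self, src, dst, log):
        self.src = src
        self.dst = dst
        self.log = log

    def forward(self):
        # The early-return check lives here: same placement means no work,
        # but the node itself still exists for the backward pass.
        return convert(self.src, self.dst, self.log)

    def backward(self, grad_placement):
        # The gradient must come back in the input's placement, even if the
        # forward pass was a no-op.
        return convert(grad_placement, self.src, self.log)


def redistribute_old(src, dst, log):
    # Assumed pre-fix shape: early return before any node is created.
    if src == dst:
        return None
    return ToyRedistribute(src, dst, log)


def redistribute_new(src, dst, log):
    # Assumed post-fix shape: always create the node.
    return ToyRedistribute(src, dst, log)


log = []
node = redistribute_new("Shard(0)", "Shard(0)", log)
forward_placement = node.forward()
grad_result = node.backward("Replicate")
skipped = redistribute_old("Shard(0)", "Shard(0)", log)
```

In this toy, `skipped` is `None`, so there would be nothing to convert a gradient that arrives with a different placement; the node returned by `redistribute_new` still performs that conversion in `backward`.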

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121653
Approved by: https://github.com/awgu
2024-03-12 17:37:30 +00:00
_composable [FSDP2] Zeroed padded tensor in _apply (#121509) 2024-03-08 22:31:19 +00:00
_shard [dist][sharded_tensor] Fix ChunkShardingSpec metadata offsets for empty shards (#121002) 2024-03-02 08:58:48 +00:00
_sharded_tensor
_sharding_spec
_spmd Fix ouput typos (#120870) 2024-02-29 08:29:14 +00:00
_tensor [dtensor] move early return check into redistribute autograd function (#121653) 2024-03-12 17:37:30 +00:00
_tools fix: docstring error in torch/distributed module (#113241) 2023-11-09 19:10:20 +00:00
algorithms [DDP] Use compiled_autograd to trace DDP backward allreduce (#110662) 2024-02-08 03:03:15 +00:00
autograd [BE]: pyupgrade Python to 3.8 - imports and object inheritance only (#94308) 2023-02-07 21:10:56 +00:00
benchmarks Enable possibly-undefined error code (#118533) 2024-01-30 21:07:01 +00:00
checkpoint Revert "[DCP] Makes fsspec public (#121508)" 2024-03-12 17:02:43 +00:00
elastic [Torch Elastic][Draft] Refactor SubprocessHandler to separate module for easier subclass (#120373) 2024-03-08 01:37:34 +00:00
examples Fix typos under torch/distributed directory (#95638) 2023-03-27 21:13:44 +00:00
fsdp [FSDP][StateDict] Allow FULL_STATE_DICT option for 2D (#120837) 2024-03-05 10:03:44 +00:00
launcher [Torchelastic][Logging] Pluggable logsspecs using python entrypoints and option to specify one by name. (#120942) 2024-03-02 08:07:52 +00:00
nn Fix get_rank under a non-default group. (#120481) 2024-03-11 05:40:54 +00:00
optim [mta] Fused SGD (#116585) 2024-01-16 23:54:38 +00:00
pipeline [c10d] Deprecate torch.distributed.pipeline (#121464) 2024-03-08 19:55:02 +00:00
rpc [BE]: Use iterable.chain.from_iterable where possible (#116376) 2023-12-27 19:20:07 +00:00
tensor [dtensor] move early return check into redistribute autograd function (#121653) 2024-03-12 17:37:30 +00:00
__init__.py Fix torch.distributed.breakpoint (#115705) 2023-12-13 20:33:56 +00:00
_composable_state.py Fix docstring errors in _composable_state.py, remote_device.py, value_ranges.py, utils.py, run.py, rendezvous.py, launch.py, argparse_util.py, __init__.py, _cycles.py (#112953) 2023-11-08 01:13:09 +00:00
_functional_collectives_impl.py [functional collecitve] don't import torchdynamo when running torchdeploy (#120900) 2024-02-29 19:20:54 +00:00
_functional_collectives.py [dynamo] support rewriting dist.all_reduce with explicitly specified reduce op (#120181) 2024-03-09 08:28:22 +00:00
_state_dict_utils.py [DCP][state_dict] Let _offload_state_dict_to_cpu to return the companion_obj if it exist. (#121273) 2024-03-08 00:24:29 +00:00
argparse_util.py Add --local-ranks-filter to torchrun: allow logs filtering by rank (#118562) 2024-02-07 04:29:54 +00:00
c10d_logger.py Re-enable type checking for distributed_c10d.py (#115223) 2023-12-09 11:07:54 +00:00
collective_utils.py [Reland] Update mypy to 1.4.1 (#105227) 2023-07-15 20:30:20 +00:00
constants.py Switch env variable use in test harnesses to the non-deprecated names to fix warnings (#114880) 2023-12-01 20:08:23 +00:00
CONTRIBUTING.md
device_mesh.py [DeviceMesh] Add support for nD slicing (#119752) 2024-03-10 00:16:37 +00:00
distributed_c10d.py [c10d] Add complex support for P2P (#121240) 2024-03-08 22:47:49 +00:00
launch.py Fix docstring errors in _composable_state.py, remote_device.py, value_ranges.py, utils.py, run.py, rendezvous.py, launch.py, argparse_util.py, __init__.py, _cycles.py (#112953) 2023-11-08 01:13:09 +00:00
logging_handlers.py
remote_device.py Fix docstring errors in _composable_state.py, remote_device.py, value_ranges.py, utils.py, run.py, rendezvous.py, launch.py, argparse_util.py, __init__.py, _cycles.py (#112953) 2023-11-08 01:13:09 +00:00
rendezvous.py Enable local_partial_types (#118467) 2024-01-28 13:38:22 +00:00
run.py [Torchelastic][Logging] Pluggable logsspecs using python entrypoints and option to specify one by name. (#120942) 2024-03-02 08:07:52 +00:00
utils.py [fsdp][torch.compile] FSDP changes (#115497) 2023-12-19 18:44:36 +00:00