pytorch/test/distributed
Wanchao Liang a26480a4d1 [dtensor] move early return check into redistribute autograd function (#121653)
This PR fixes a bug in redistribute by moving the early-return check into
the redistribute autograd function. Even when we redistribute to the same
placement, the grad_placements from the `to_local` call might be
different, so the redistribute backward still needs to happen.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121653
Approved by: https://github.com/awgu
2024-03-12 17:37:30 +00:00
_composable [FSDP2][BE] Refactored check_1d_sharded_parity to use mesh (#121357) 2024-03-11 22:34:42 +00:00
_shard [dist][sharded_tensor] Fix ChunkShardingSpec metadata offsets for empty shards (#121002) 2024-03-02 08:58:48 +00:00
_spmd Update to TorchFix 0.4.0 (#119424) 2024-02-12 23:30:12 +00:00
_tensor [dtensor] move early return check into redistribute autograd function (#121653) 2024-03-12 17:37:30 +00:00
_tools
algorithms
bin
checkpoint Revert "[DCP] Makes fsspec public (#121508)" 2024-03-12 17:02:43 +00:00
elastic [Torchelasic] Create root log directory by default (#121257) 2024-03-06 18:50:38 +00:00
fsdp [TP] Introduce Sequence Parallel Style for Laynorm/RMSNorm/Dropout (#121295) 2024-03-07 02:04:59 +00:00
launcher [Torchelastic][Logging] Pluggable logsspecs using python entrypoints and option to specify one by name. (#120942) 2024-03-02 08:07:52 +00:00
nn/jit
optim
pipeline/sync Refactor some tests by using TEST_CUDA & TEST_MULTIGPU instead (#116083) 2024-01-03 08:53:59 +00:00
rpc [BE]: Enable F821 and fix bugs (#116579) 2024-01-01 08:40:46 +00:00
tensor/parallel [DTensor] Moved Transformer sharding to staticmethod (#121660) 2024-03-12 15:08:57 +00:00
argparse_util_test.py
test_c10d_common.py [c10d] Fix the hang issue in store.check(TIMEOUT_DUMP) (#116297) 2023-12-22 04:04:30 +00:00
test_c10d_functional_native.py Disable GroupRegistry's thread isolation by default (#121457) 2024-03-08 19:31:24 +00:00
test_c10d_gloo.py ProcessGroupGloo::reduce_scatter_tensor_coalesced (#118911) 2024-02-03 02:42:47 +00:00
test_c10d_logger.py
test_c10d_nccl.py Update error message (#121644) 2024-03-12 13:04:21 +00:00
test_c10d_object_collectives.py
test_c10d_pypg.py
test_c10d_spawn_gloo.py
test_c10d_spawn_nccl.py
test_c10d_spawn_ucc.py
test_c10d_spawn.py [BE]: Update flake8 to v6.1.0 and fix lints (#116591) 2024-01-03 06:04:44 +00:00
test_c10d_ucc.py Fix the skip condition for test_c10d tests (#119938) 2024-02-15 11:03:39 +00:00
test_collective_utils.py
test_compute_comm_reordering.py [reland] Fix estimate_nccl_collective_runtime (#118986) 2024-02-12 18:48:06 +00:00
test_data_parallel.py
test_device_mesh.py [DeviceMesh] Add support for nD slicing (#119752) 2024-03-10 00:16:37 +00:00
test_distributed_spawn.py Make test_distributed_spawn.py tell you how to run it correctly (#112924) 2023-11-04 02:43:43 +00:00
test_dynamo_distributed.py Pass inductor strides forward in ddp optimizer (#120523) 2024-02-29 22:25:00 +00:00
test_fake_pg.py further deprecate PairwiseParallel and SequenceParallel from test (#114402) 2023-11-30 05:06:08 +00:00
test_functional_api.py Change TestOpWaitiness to use MultiProcessTestCase (#121046) 2024-03-02 01:12:14 +00:00
test_inductor_collectives.py [dynamo] support rewriting dist.all_reduce with explicitly specified reduce op (#120181) 2024-03-09 08:28:22 +00:00
test_launcher.py
test_multi_threaded_pg.py [FSDP2] Used ReduceOp.AVG if fp32 reduce-scatter (#120919) 2024-03-02 00:39:16 +00:00
test_nccl.py [BE]: Enable F821 and fix bugs (#116579) 2024-01-01 08:40:46 +00:00
test_pg_wrapper.py Switch env variable use in test harnesses to the non-deprecated names to fix warnings (#114880) 2023-12-01 20:08:23 +00:00
test_store.py [c10d] Add a recursive method to get the inner most store (#117074) 2024-01-10 20:22:55 +00:00