_shard
Revert D34284271: [TLC][checkpoint] Add unit test for StatefulComponentCheckpointAgent
2022-02-19 21:28:55 +00:00
_sharded_tensor
[reland] Create torch.distributed._shard package. (#72141)
2022-02-02 06:58:20 +00:00
_sharding_spec
[reland] Create torch.distributed._shard package. (#72141)
2022-02-02 06:58:20 +00:00
algorithms
[Join][BE] Fix typo; remove obsolete method (#72886)
2022-02-16 15:03:09 +00:00
autograd
Add Python declaration of torch._C and torch._C._autograd modules. (#46622)
2020-11-06 01:25:47 -08:00
benchmarks
Add lint for unqualified type: ignore (#56290)
2021-04-21 08:07:23 -07:00
elastic
[codemod][type-comments] Convert type comments in api.py (#73084)
2022-02-19 00:31:45 +00:00
fsdp
[FSDP][Reland] Implement local_state_dict and load_local_state_dict
2022-02-23 07:57:34 -08:00
launcher
(torch/elastic) fix scale down bug caused by calling rdzv_handler.shutdown() on premature agent failures (#67749)
2021-11-05 12:18:46 -07:00
nn
Revert D33716716: [pytorch][PR] Added remove_duplicate parameter to nn.Module
2022-02-03 09:04:29 +00:00
optim
[ZeRO] (Reland) Add ctor support for multiple param groups (#72932)
2022-02-22 16:29:55 +00:00
pipeline
Remove dtype from torch.Storage and use only torch.ByteStorage (#62030)
2021-10-05 13:50:34 -07:00
rpc
[distributed] Make rref_proxy._invoke_rpc trully async when needed. (#70206)
2022-01-19 23:37:15 +00:00
__init__.py
Add pybind trampoline for ProcessGroup and Work (#66338)
2021-10-11 06:41:06 -07:00
argparse_util.py
[19/n][torch/elastic][upstream] Replace pytorch.distributed.launch with torchelastic launcher (#56214)
2021-04-16 13:38:23 -07:00
constants.py
make ProcessGroupDefaultTimeout the same as python (#56549)
2021-04-21 17:56:05 -07:00
CONTRIBUTING.md
Update distributed contributing guide to show how to run one test in test_distributed_spawn (#67801)
2021-11-04 08:54:31 -07:00
distributed_c10d.py
Stop writing logs to root logger (#72649)
2022-02-11 21:30:53 +00:00
launch.py
Introduce the torchrun entrypoint (#64049)
2021-08-26 20:17:48 -07:00
remote_device.py
Basic implementation of ShardedLinear using ShardedTensor. (#64128)
2021-09-20 18:31:11 -07:00
rendezvous.py
Update _create_c10d_store to check port value (#71863)
2022-01-26 22:29:33 +00:00
run.py
(torch/elastic) add fqdn hostname to error printout (#66182)
2021-10-07 01:40:02 -07:00