This PR adds support for the following use cases:
- Sync style:
```
# Collectives issued inside the context are coalesced into one batched
# launch when the context exits.
with dist._coalescing_manager():
    for i in range(num_coll):
        dist.all_gather_into_tensor(output_tensors[i], input_tensors[i])
```
- Async style:
```
# With async_ops=True, the coalesced collectives run asynchronously;
# cm.wait() blocks until they have all completed.
with dist._coalescing_manager(async_ops=True) as cm:
    for i in range(num_coll):
        dist.all_gather_into_tensor(output_tensors[i], input_tensors[i])
# do a bunch of other things
cm.wait()
# do things that depend on the all-gathers
```
Each `all_gather_into_tensor` is independent in terms of its data and buffer location, but the calls can be executed in parallel by backends that support coalescing (such as NCCL). A self-contained sketch of the async style follows.
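The snippets above leave the process-group setup and tensor allocation implicit. Below is a minimal runnable sketch of the async style, assuming a launch via `torchrun` with the NCCL backend; `num_coll`, the shard size, and the helper variable names are illustrative and not part of the PR.

```
import os

import torch
import torch.distributed as dist


def main():
    # torchrun sets RANK, WORLD_SIZE, LOCAL_RANK, MASTER_ADDR, MASTER_PORT.
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    world_size = dist.get_world_size()
    device = torch.device("cuda", int(os.environ["LOCAL_RANK"]))
    torch.cuda.set_device(device)

    num_coll = 4
    shard = 1024  # illustrative per-rank shard size
    input_tensors = [
        torch.full((shard,), float(rank), device=device) for _ in range(num_coll)
    ]
    output_tensors = [
        torch.empty(world_size * shard, device=device) for _ in range(num_coll)
    ]

    # Async style: the coalesced all-gathers are launched together on context
    # exit and can overlap with the unrelated work below.
    with dist._coalescing_manager(async_ops=True) as cm:
        for i in range(num_coll):
            dist.all_gather_into_tensor(output_tensors[i], input_tensors[i])
    unrelated = torch.randn(shard, device=device).sum()  # work independent of the gathers
    cm.wait()

    # Safe to consume the gathered outputs now.
    print(f"rank {rank}: gathered[0][0]={output_tensors[0][0].item()}, unrelated={unrelated.item():.3f}")
    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Run with, for example, `torchrun --nproc_per_node=2 coalesce_example.py` (a hypothetical file name) on a machine with at least two GPUs.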
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101157
Approved by: https://github.com/kumpera, https://github.com/wanchaol
| Name |
|---|
| _composable |
| _shard |
| _sharded_tensor |
| _sharding_spec |
| _spmd |
| _tensor |
| _tools |
| algorithms |
| autograd |
| benchmarks |
| checkpoint |
| elastic |
| examples |
| fsdp |
| launcher |
| nn |
| optim |
| pipeline |
| rpc |
| tensor |
| __init__.py |
| _composable_state.py |
| _functional_collectives.py |
| argparse_util.py |
| c10d_error_logger.py |
| constants.py |
| CONTRIBUTING.md |
| distributed_c10d.py |
| launch.py |
| logging_handlers.py |
| remote_device.py |
| rendezvous.py |
| run.py |
| utils.py |