Wanchao Liang
f026b32008
[device_mesh][BE] reduce_scatter fallback to funcol and remove from DM (#105642)
...
For reasons similar to https://github.com/pytorch/pytorch/pull/105605
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105642
Approved by: https://github.com/kumpera, https://github.com/wz337, https://github.com/fduwjj
2023-07-27 01:33:05 +00:00
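A minimal sketch of what the fallback means for callers (the (mesh, dim) group argument and exact funcol signatures are assumptions based on the functional collectives API of this era):

```python
import torch
import torch.distributed as dist
import torch.distributed._functional_collectives as funcol
from torch.distributed._tensor import DeviceMesh

# Assumes dist.init_process_group(...) has already run on every rank.
mesh = DeviceMesh("cuda", torch.arange(dist.get_world_size()))

x = torch.randn(dist.get_world_size() * 4, device="cuda")
# Instead of the removed DeviceMesh.reduce_scatter, call the functional
# collective directly; (mesh, 0) selects the group of mesh dimension 0.
out = funcol.reduce_scatter_tensor(x, "sum", scatter_dim=0, group=(mesh, 0))
# The result is async and waits automatically on first use.
```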
Wanchao Liang
2fa063e1e0
[device_mesh][BE] remove allgather from DM (#105614)
...
For reasons similar to https://github.com/pytorch/pytorch/pull/105605
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105614
Approved by: https://github.com/rohan-varma, https://github.com/wz337, https://github.com/fduwjj
2023-07-27 01:33:05 +00:00
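The all_gather removal follows the same pattern; a hedged sketch, with the group given as an explicit rank list as the test code later in this log does:

```python
import torch
import torch.distributed as dist
import torch.distributed._functional_collectives as funcol

# Assumes an initialized default process group.
ranks = list(range(dist.get_world_size()))
x = torch.randn(4, 8)
# Replacement for the removed DeviceMesh.all_gather: shards are concatenated
# along gather_dim, so the result here has shape (world_size * 4, 8).
gathered = funcol.all_gather_tensor(x, gather_dim=0, group=ranks)
```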
Wanchao Liang
8b94280008
[functional collective] parameterize allreduce tests (#105604)
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105604
Approved by: https://github.com/rohan-varma
2023-07-24 22:21:19 +00:00
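A rough sketch of the parametrization pattern PyTorch's test suite uses for this kind of change (the test body here is hypothetical):

```python
import torch
from torch.testing._internal.common_utils import (
    TestCase,
    instantiate_parametrized_tests,
    parametrize,
    run_tests,
)


class TestAllReduce(TestCase):
    # One test body covers several reduce ops instead of near-duplicate tests.
    @parametrize("reduce_op", ["sum", "max", "min", "avg"])
    def test_all_reduce(self, reduce_op):
        ...  # run the collective with reduce_op and check the result


instantiate_parametrized_tests(TestAllReduce)

if __name__ == "__main__":
    run_tests()
```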
Rodrigo Kumpera
17ab4f85e9
[c10d] Adopt allgather_into_tensor_coalesced for NCCL. (#103086)
...
This is done by adding a c10d::_allgather_into_tensor_coalesced wrapper.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103086
Approved by: https://github.com/rohan-varma
2023-07-06 15:05:55 +00:00
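A hedged sketch of the call this enables, assuming the funcol entry point all_gather_into_tensor_coalesced added by #98642 below:

```python
import torch
import torch.distributed as dist
import torch.distributed._functional_collectives as ft_c

# Assumes an initialized NCCL process group.
ranks = list(range(dist.get_world_size()))
xs = [torch.randn(4, device="cuda"), torch.randn(8, device="cuda")]
# One coalesced call instead of one all_gather per tensor; with this change
# the c10d::_allgather_into_tensor_coalesced wrapper can route to NCCL's
# coalesced implementation rather than a generic fallback.
outs = ft_c.all_gather_into_tensor_coalesced(xs, group=ranks)
```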
Rodrigo Kumpera
c17bdb3247
[C10D] Add functional collective reduce_scatter_into_tensor_coalesced. (#101023)
...
The implementation uses a fallback that does no coalescing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101023
Approved by: https://github.com/wanchaol
2023-06-23 19:24:11 +00:00
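Conceptually, a fallback that does no coalescing is just a per-tensor loop; a hypothetical sketch (the helper name is invented for illustration):

```python
import torch
import torch.distributed._functional_collectives as ft_c


def reduce_scatter_coalesced_fallback(inputs, reduce_op, scatter_dim, group):
    # Issue one reduce_scatter per tensor; correct, but forfeits the
    # bandwidth benefit that real coalescing would provide.
    return [
        ft_c.reduce_scatter_tensor(t, reduce_op, scatter_dim, group)
        for t in inputs
    ]
```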
Rodrigo Kumpera
63fe26809d
Implement all_gather_into_tensor_coalesced. (#98642)
...
The implementation is suboptimal since it uses c10d's group coalescing, which
is known to be inefficient.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98642
Approved by: https://github.com/wanchaol
2023-06-13 15:06:52 +00:00
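A hedged sketch of the group-coalescing pattern the message refers to (the _coalescing_manager import path and semantics are assumptions):

```python
import torch
import torch.distributed as dist
from torch.distributed.distributed_c10d import _coalescing_manager


def all_gather_coalesced_via_group(outputs, inputs, group):
    # Batch several all_gathers inside a coalescing context so c10d can
    # issue them as one grouped call; this is the path the commit message
    # flags as inefficient compared to a native coalesced kernel.
    with _coalescing_manager(group=group):
        for out, inp in zip(outputs, inputs):
            dist.all_gather_into_tensor(out, inp, group=group)
```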
Rodrigo Kumpera
5b4a523583
Add all_reduce_coalesced to functional collectives (#98640)
...
This adds all_reduce_coalesced to MTPG to ease testing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98640
Approved by: https://github.com/wanchaol
2023-04-26 17:05:54 +00:00
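A minimal sketch of the kind of MTPG-backed test this eases, modeled on the threaded helper used elsewhere in this log:

```python
import torch
import torch.distributed as dist
import torch.distributed._functional_collectives as ft_c
from torch.testing._internal.common_distributed import (
    spawn_threads_and_init_comms,
)


# MTPG runs each "rank" as a thread in one process, so collectives can be
# exercised without a multi-process launch.
@spawn_threads_and_init_comms(world_size=4)
def check_all_reduce_coalesced(self):
    xs = [torch.ones(2) * (dist.get_rank() + 1), torch.ones(3)]
    ys = ft_c.all_reduce_coalesced(xs, "sum", list(range(4)))
    assert ys[0].eq(10).all()  # 1 + 2 + 3 + 4
    assert ys[1].eq(4).all()   # ones summed across 4 ranks


if __name__ == "__main__":
    check_all_reduce_coalesced(None)
```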
PyTorch MergeBot
e778bcec05
Revert "fix allgather func collective to use maybe_wrap_tensor ( #98866 )"
...
This reverts commit ada7dfff71.
Reverted https://github.com/pytorch/pytorch/pull/98866 on behalf of https://github.com/izaitsevfb due to Conflicts with the co-dev diff D44921259, reverting to unblock the diff train
2023-04-14 00:30:16 +00:00
Wanchao Liang
ada7dfff71
fix allgather func collective to use maybe_wrap_tensor (#98866)
...
It looks like we forgot to switch allgather to use maybe_wrap_tensor;
this PR switches it over and adds a test to guard the tracing behavior
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98866
Approved by: https://github.com/mrshenli
2023-04-12 19:13:46 +00:00
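A conceptual sketch of the wrapping behavior being fixed here (the real helper is internal to functional collectives; this is not its actual code):

```python
import torch
from torch.distributed._functional_collectives import (
    AsyncCollectiveTensor,
    wait_tensor,
)


def maybe_wrap_tensor_sketch(tensor, is_tracing):
    # Under tracing, materialize the wait so it is recorded in the graph;
    # in eager mode, wrap the result so the wait is deferred to first use.
    if is_tracing:
        return wait_tensor(tensor)
    return AsyncCollectiveTensor(tensor)
```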
PyTorch MergeBot
fa08e546f3
Revert "Add all_reduce_coalesced functional collective ( #97157 )"
...
This reverts commit a3fc3531f5.
Reverted https://github.com/pytorch/pytorch/pull/97157 on behalf of https://github.com/huydhn due to Sorry for reverting your PR, but it seems to have a land race with https://github.com/pytorch/pytorch/pull/96226 and fails lint on trunk
2023-04-04 01:50:49 +00:00
Rodrigo Kumpera
a3fc3531f5
Add all_reduce_coalesced functional collective (#97157)
...
Inductor codegen is suboptimal when calling all_reduce_coalesced with input args; we need to fix Inductor's calling convention for that, or find another approach.
This might not work if any output is unused.
Test code:
```python
import torch
import torch.distributed as dist
import torch.nn.functional as F
from functorch import make_fx
import os
import torch.distributed._functional_collectives as ft_c
from torch.testing._internal.common_distributed import (
    spawn_threads_and_init_comms,
)
from torch._inductor.compile_fx import compile_fx_inner
import torch._dynamo  # needed for torch._dynamo.config below


def my_fun(a, b):
    c = a * 3
    tensors = ft_c.all_reduce_coalesced([a, c, b], "sum", [0])
    return ((tensors[1] + tensors[0] + tensors[2]).sum(),)


@spawn_threads_and_init_comms(world_size=1)
def inductor_main(self):
    x = torch.arange(4).cuda() * (dist.get_rank() + 1)
    y = torch.arange(4).cuda() * (dist.get_rank() + 1)
    x = x.to(torch.float)
    y = y.to(torch.float) * 0.5

    res = make_fx(my_fun)(x, y)
    print(f"fx graph:\n{res.graph}")

    ind = compile_fx_inner(res, [x, y])
    print(f"inductor done:\n{ind}")


os.environ["PROXY_TENSOR_TRACING"] = "1"
os.environ["TORCH_COMPILE_DEBUG"] = "1"
torch._dynamo.config.output_code = True

if __name__ == "__main__":
    inductor_main(None)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97157
Approved by: https://github.com/fegin
2023-04-04 01:13:18 +00:00
Wanchao Liang
848bf8103b
fix functional collective to not generate getattr node (#97924)
...
Use mesh.get_dim_groups directly instead of doing mesh tensor operations.
This helps us get rid of the getattr ops during tracing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97924
Approved by: https://github.com/kumpera
2023-03-30 20:14:50 +00:00
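A hedged before/after sketch of the fix, assuming the DeviceMesh API of the time:

```python
import torch
from torch.distributed._tensor import DeviceMesh

# Assumes an initialized process group with 4 ranks.
mesh = DeviceMesh("cuda", torch.arange(4))

# Before: deriving ranks from the mesh tensor (e.g. via mesh.mesh slicing)
# leaves getattr/aten nodes in the traced FX graph.
# After: ask the mesh for its process group directly; nothing to trace.
dim0_group = mesh.get_dim_groups()[0]
```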
Rodrigo Kumpera
e22d791287
[PTD] Introduce tracing friendly collectives. (#93990)
...
This change adds torch.distributed.traceable_collectives.
This experimental API enables collectives to be fully traced by dynamo and FX.
See #93173 for the RFC.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93990
Approved by: https://github.com/wconstab, https://github.com/wanchaol, https://github.com/H-Huang
2023-02-16 15:35:01 +00:00
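A small sketch of what "fully traced" means in practice, reusing the imports from the test code earlier in this log:

```python
import torch
import torch.distributed as dist
import torch.distributed._functional_collectives as ft_c
from functorch import make_fx
from torch.testing._internal.common_distributed import (
    spawn_threads_and_init_comms,
)


def fn(x):
    # Functional form: returns a new tensor instead of mutating in place,
    # which is what lets dynamo/FX trace through the collective.
    return ft_c.all_reduce(x, "sum", list(range(dist.get_world_size()))) + 1


@spawn_threads_and_init_comms(world_size=2)
def trace_it(self):
    gm = make_fx(fn)(torch.ones(4))
    print(gm.graph)  # contains all_reduce and wait_tensor graph nodes


if __name__ == "__main__":
    trace_it(None)
```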