This PR gets rid of the dim_groups attribute from DeviceMesh. The main
motivation is that c10d should store the process groups when they are
created, rather than DeviceMesh holding them; DeviceMesh should just
handle ranks correctly.
This could enable DTensor to become picklable (torch.save/load could
become possible), which I will try in the next PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103105
Approved by: https://github.com/XilunWu, https://github.com/fduwjj
Also not sure whether this should be a public function. Leaving it private for now, but let me know if you'd prefer it to be public.
FYI @nikitaved this will logically conflict with your triton kernel PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101420
Approved by: https://github.com/malfet
Summary:
Currently there are build configs where the torchdynamo import trips over a
strange SystemError, related to some module's __dict__.items() returning NULL,
while torchdynamo tries to iterate over all torch modules and process them for
its allowed functions list.
While this is hard to repro, we should be able to work around it and then fix
it properly.
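As an illustration of the kind of workaround described, here is a minimal sketch (not the actual patch) that defensively iterates a module's __dict__ so a broken module is skipped instead of aborting the scan; the helper name is hypothetical.
```python
# Hypothetical sketch only: tolerate modules whose __dict__.items() raises,
# instead of letting a SystemError abort the whole allowed-functions scan.
import types
from typing import Any, Iterator, Tuple

def _iter_module_members(mod: types.ModuleType) -> Iterator[Tuple[str, Any]]:
    try:
        items = list(mod.__dict__.items())
    except SystemError:
        # Some build configs hit a SystemError here (__dict__.items() returns
        # NULL internally); skip the module rather than failing the import.
        return
    for name, value in items:
        yield name, value
```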
Test Plan: Rely on others to test this, assuming CI passes.
Reviewed By: anijain2305
Differential Revision: D45663313
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100901
Approved by: https://github.com/yanboliang, https://github.com/malfet
We do it by making it possible to register multiple tensors for the same
work object and to coordinate waiting/cleanup among them.
This ensures that waiting on any number of the output tensors results in a
single stream sync, which simplifies codegen in Inductor.
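Conceptually, the bookkeeping is something like the toy below; the class and method names are illustrative, not the real _functional_collectives internals.
```python
# Toy sketch of the idea: several output tensors share one work handle, the
# first wait performs the single stream sync, and later waits on sibling
# tensors become no-ops (cleanup happens as entries are popped).
class _ToyWorkRegistry:
    def __init__(self) -> None:
        self._entry_by_tensor = {}            # id(tensor) -> shared entry

    def register(self, work, tensors) -> None:
        entry = {"work": work, "waited": False}
        for t in tensors:
            self._entry_by_tensor[id(t)] = entry

    def wait_tensor(self, tensor) -> None:
        entry = self._entry_by_tensor.pop(id(tensor), None)
        if entry is None:
            return                            # never registered or already cleaned up
        if not entry["waited"]:
            entry["work"].wait()              # the single stream sync
            entry["waited"] = True
```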
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99763
Approved by: https://github.com/wanchaol
Summary:
This diff is reverting D45387167
D45387167: Basic dynamo support for traceable collectives (#94440) by wconstab has been identified as causing the following test or build failures (internal).
If you believe this diff has been generated in error you may Commandeer and Abandon it.
Test Plan: NA
Reviewed By: s4ayub
Differential Revision: D45448312
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100424
Approved by: https://github.com/rohan-varma, https://github.com/kumpera
Make traceable collectives work with torchdynamo,
bypassing problems with tracing the AsyncTensor subclass.
Accept a suboptimal solution for now, and optimize it later.
For now, wait happens immediately, which generally forces an early sync.
Later, find a way either in dynamo or in the AOT stack to handle
AsyncCollectiveTensor so the wait lands in the optimal place.
Note on implementation:
- Dynamo traces 'user-level' functional collective APIs that are designed to behave
differently in eager vs compiled mode. In eager mode, there is work-obj registration
and a wrapper subclass inserts a 'wait' call at the appropriate time.
In compile/trace mode, wait is called immediately, and work-obj registration
is required to be handled by the compile backend at runtime (a toy version of
this split is sketched after these notes).
- Dynamo needs to trace into some of the helper functions in the 'user-level'
api, such as '_expand_group', which is essentially a constant transformation.
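As a rough, self-contained toy of the split described in the first note above (FakeWork and AsyncTensorStub are stand-ins, not PyTorch classes):
```python
# Toy illustration of the eager-vs-compiled behavior split. FakeWork and
# AsyncTensorStub are invented for this sketch; they are not PyTorch APIs.
import torch

class FakeWork:
    def wait(self) -> None:
        pass  # stand-in for the real work object's stream sync

class AsyncTensorStub:
    """Toy analogue of AsyncCollectiveTensor: defers wait() until first use."""
    def __init__(self, tensor: torch.Tensor, work: FakeWork) -> None:
        self._tensor, self._work, self._waited = tensor, work, False

    def materialize(self) -> torch.Tensor:
        if not self._waited:
            self._work.wait()
            self._waited = True
        return self._tensor

def user_level_collective(tensor: torch.Tensor, *, compiling: bool):
    work = FakeWork()                     # pretend the collective was issued
    if compiling:
        work.wait()                       # traced path: wait immediately; the
        return tensor                     # backend handles work registration at runtime
    return AsyncTensorStub(tensor, work)  # eager path: wrapper defers the wait
```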
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94440
Approved by: https://github.com/kumpera
Summary:
Original commit changeset: ba36f8751adc
Original Phabricator Diff: D44788697
Test Plan: model loading is fine after reverting the diff
Reviewed By: zyan0, sayitmemory
Differential Revision: D44921259
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99168
Approved by: https://github.com/izaitsevfb
Inductor codegen is suboptimal when calling all_reduce_coalesced with input args. We need to fix Inductor's calling convention for that, or find another approach.
It might not work if any of the outputs is unused.
Test code:
```python
import torch
import torch.distributed as dist
import torch.nn.functional as F
from functorch import make_fx
import os
import torch.distributed._functional_collectives as ft_c
from torch.testing._internal.common_distributed import (
    spawn_threads_and_init_comms,
)
from torch._inductor.compile_fx import compile_fx_inner


def my_fun(a, b):
    c = a * 3
    tensors = ft_c.all_reduce_coalesced([a, c, b], "sum", [0])
    return ((tensors[1] + tensors[0] + tensors[2]).sum(),)


@spawn_threads_and_init_comms(world_size=1)
def inductor_main(self):
    x = torch.arange(4).cuda() * (dist.get_rank() + 1)
    y = torch.arange(4).cuda() * (dist.get_rank() + 1)
    x = x.to(torch.float)
    y = y.to(torch.float) * 0.5

    res = make_fx(my_fun)(x, y)
    print(f"fx graph:\n{res.graph}")

    ind = compile_fx_inner(res, [x, y])
    print(f"inductor done:\n{ind}")


os.environ["PROXY_TENSOR_TRACING"] = "1"
os.environ["TORCH_COMPILE_DEBUG"] = "1"
torch._dynamo.config.output_code = True

if __name__ == "__main__":
    inductor_main(None)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97157
Approved by: https://github.com/fegin
Among the changes is the introduction of gather_dim and scatter_dim in DeviceMesh collectives to simplify user code.
The current plan is to keep padding and gather/scatter dim support in DeviceMesh while we explore optimization opportunities in Inductor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96226
Approved by: https://github.com/wanchaol
_functional_collectives.py: Ensure we always wait on all collectives.
derivatives.yaml: Mark all_reduce as non-differentiable.
gen_variable_type.py: Add all_reduce to DONT_ENFORCE_TENSOR_IMPL_USE_COUNT.
common_dtensor.py: Replace dist.barrier with all_reduce.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95897
Approved by: https://github.com/wconstab, https://github.com/fegin
Inductor implementations of collectives/wait must match
the eager impls in _functional_collectives in terms of how they interact
with the _register_tensor_work API. If they do, then splitting
a collective-wait pair so that one half is in a compiled graph should
work fine.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95893
Approved by: https://github.com/kumpera
BC: This changes the signature and semantics of DeviceMesh::all_reduce.
DeviceMesh::all_reduce now uses a functional collective under the hood which makes it more easily traceable.
You no longer need to use CommTensor to get a trace.
all_reduce is now async-only and uses AsyncCollectiveTensor to ensure proper stream synchronization.
Signature change: the `async_op` param was removed, and the return type changed from `Optional[Work]` to `torch.Tensor`.
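For illustration only, the call-site change might look roughly like the sketch below; apart from `async_op` and the return types stated above, the parameters and mesh setup are assumed, and a real run needs an initialized process group and a DeviceMesh.
```python
# Hypothetical before/after of the DeviceMesh.all_reduce change described above.
# Only the stated facts (async_op removed, Optional[Work] -> torch.Tensor) come
# from this PR; everything else is assumed for illustration.
import torch

def old_call_site(mesh, t: torch.Tensor) -> torch.Tensor:
    work = mesh.all_reduce(t, async_op=True)   # old: returned Optional[Work]
    if work is not None:
        work.wait()                            # caller synced explicitly
    return t

def new_call_site(mesh, t: torch.Tensor) -> torch.Tensor:
    # New: functional collective under the hood; the returned tensor is an
    # AsyncCollectiveTensor that syncs the stream when it is first used.
    return mesh.all_reduce(t)
```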
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95009
Approved by: https://github.com/wanchaol