Edward Z. Yang
ef3be6726f
Make distributed modules importable even when backend not built (#159889)
...
This PR is greatly simplified now that it is stacked on top of a PR that always builds with distributed. We only need to stub functions that may not be defined because a backend is not enabled.
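The stubbing idea can be illustrated with a minimal sketch. It is not the code from this PR: the helper `_load_or_stub` and the module name `_hypothetical_backend_ext` are assumptions made up for illustration. The point is only that the import always succeeds and the failure is deferred to call time.

```python
import importlib


def _load_or_stub(module_name: str, attr: str):
    """Return module_name.attr if it exists, else a stub that raises when called.

    Hypothetical helper, for illustration only.
    """
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    except (ImportError, AttributeError):

        def _stub(*args, **kwargs):
            # The error surfaces only when the missing function is actually used.
            raise RuntimeError(
                f"{module_name}.{attr} is unavailable: backend not built"
            )

        _stub.__name__ = attr
        return _stub


# "_hypothetical_backend_ext" is an assumed name that does not exist,
# so this line binds a stub instead of failing at import time.
fancy_all_reduce = _load_or_stub("_hypothetical_backend_ext", "fancy_all_reduce")

# A symbol that does exist is returned unchanged.
sqrt = _load_or_stub("math", "sqrt")
```

With this shape, a module that references `fancy_all_reduce` stays importable, and callers see a `RuntimeError` only if they invoke it.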
Signed-off-by: Edward Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159889
Approved by: https://github.com/wconstab
ghstack dependencies: #160449
2025-09-04 20:05:50 +00:00
PyTorch MergeBot
34aa78274d
Revert "Make distributed modules importable even when backend not built (#159889)"
...
This reverts commit 4ae57d448c.
Reverted https://github.com/pytorch/pytorch/pull/159889 on behalf of https://github.com/jeanschmidt due to Failing internal tests, probably typechecks. See D81588399 ([comment](https://github.com/pytorch/pytorch/pull/159889#issuecomment-3253651785))
2025-09-04 13:13:52 +00:00
Edward Z. Yang
4ae57d448c
Make distributed modules importable even when backend not built (#159889)
...
This PR is greatly simplified now that it is stacked on top of a PR that always builds with distributed. We only need to stub functions that may not be defined because a backend is not enabled.
Signed-off-by: Edward Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159889
Approved by: https://github.com/wconstab
ghstack dependencies: #160449
2025-09-03 07:33:55 +00:00
PyTorch MergeBot
420c52ecf3
Revert "Make distributed modules importable even when backend not built (#159889)"
...
This reverts commit 626cb7df81.
Reverted https://github.com/pytorch/pytorch/pull/159889 on behalf of https://github.com/jeanschmidt due to Breaking internal builds, can't be landed with forward fix due to internal tooling problems ([comment](https://github.com/pytorch/pytorch/pull/159889#issuecomment-3246677982))
2025-09-02 20:24:01 +00:00
Edward Z. Yang
626cb7df81
Make distributed modules importable even when backend not built (#159889)
...
This PR is greatly simplified now that it is stacked on top of a PR that always builds with distributed. We only need to stub functions that may not be defined because a backend is not enabled.
Signed-off-by: Edward Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159889
Approved by: https://github.com/wconstab
ghstack dependencies: #160449
2025-09-01 23:00:21 +00:00
Thomas Adams
8494d5582a
Propagate callable parameter types using ParamSpec (#142306) (#151014)
...
Partially addresses #142306
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151014
Approved by: https://github.com/Skylion007
2025-04-13 20:38:11 +00:00
Xuehai Pan
995df34b19
[BE][PYFMT] migrate PYFMT for torch.{distributed,distributions} to ruff format (#144547)
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144547
Approved by: https://github.com/kwen2501
2025-02-28 07:35:56 +00:00
Aaron Orenstein
316808e4e9
PEP585 update - torch/distributed/elastic torch/distributed/checkpoint (#145163)
...
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145163
Approved by: https://github.com/Skylion007
2025-01-19 20:55:59 +00:00
Xuehai Pan
e6d4451ae8
[BE][Easy] enable UFMT for torch/distributed/{algorithms,autograd,benchmarks,checkpoint,elastic}/ (#128866)
...
Part of #123062
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128866
Approved by: https://github.com/fegin
2024-06-18 13:51:53 +00:00
Tristan Rice
597922ba21
Reapply "distributed debug handlers (#126601)" (#127805)
...
This reverts commit 7646825c3e.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127805
Approved by: https://github.com/PaliC
2024-06-04 19:44:30 +00:00
PyTorch MergeBot
7646825c3e
Revert "distributed debug handlers (#126601)"
...
This reverts commit 3d541835d5.
Reverted https://github.com/pytorch/pytorch/pull/126601 on behalf of https://github.com/PaliC due to breaking internal typechecking tests ([comment](https://github.com/pytorch/pytorch/pull/126601#issuecomment-2141076987))
2024-05-31 01:21:24 +00:00
Tristan Rice
3d541835d5
distributed debug handlers (#126601)
...
This adds debug handlers as described in:
* https://gist.github.com/d4l3k/828b7be585c7615e85b2c448b308d925 (public copy)
* https://docs.google.com/document/d/1la68szcS6wUYElUUX-P6zXgkPA8lnfzpagMTPys3aQ8/edit (internal copy)
This adds only the C++ pieces that will be used from the main process. The Python and torchrun pieces will be added in a follow-up PR.
This adds 2 handlers out of the box:
* `/handler/ping` for testing purposes
* `/handler/dump_nccl_trace_pickle` as a POC integration with Flight Recorder
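To make the `/handler/<name>` routing concrete, here is a rough pure-Python sketch of a name-to-handler registry. It is not the C++ implementation from this PR: `register_handler`, `dispatch`, the status codes, and the `b"pong"` response body are all assumptions chosen for illustration.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry mapping handler names to functions (request body -> response body).
_HANDLERS: Dict[str, Callable[[bytes], bytes]] = {}


def register_handler(name: str):
    """Decorator that registers a handler under /handler/<name>."""

    def decorator(fn: Callable[[bytes], bytes]) -> Callable[[bytes], bytes]:
        if name in _HANDLERS:
            raise ValueError(f"handler {name!r} is already registered")
        _HANDLERS[name] = fn
        return fn

    return decorator


@register_handler("ping")
def _ping(body: bytes) -> bytes:
    # Assumed response; the PR only says this handler exists for testing.
    return b"pong"


def dispatch(path: str, body: bytes = b"") -> Tuple[int, bytes]:
    """Route a request path to its handler and return (status, response body)."""
    prefix = "/handler/"
    if not path.startswith(prefix):
        return 404, b"not found"
    handler = _HANDLERS.get(path[len(prefix):])
    if handler is None:
        return 404, b"unknown handler"
    return 200, handler(body)
```

In this sketch, adding another endpoint would only require one more decorated function; the dispatcher itself stays unchanged.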
Test plan:
```
python test/distributed/elastic/test_control_plane.py
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126601
Approved by: https://github.com/kurman, https://github.com/c-p-i-o
2024-05-30 02:21:08 +00:00