Commit Graph

17 Commits

Author SHA1 Message Date
Aaron Orenstein
99dbc5b0e2 PEP585 update - test (#145176)
See #145101 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145176
Approved by: https://github.com/bobrenjc93
2025-01-22 04:48:28 +00:00
Andrew Gu
00c18c8882 Make all-reduce input contiguous in distributed.nn.all_reduce (#144267)
Fixes https://github.com/pytorch/pytorch/issues/144060

I confirmed that the unit test fails without the `.contiguous()` fix.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144267
Approved by: https://github.com/wz337, https://github.com/Skylion007, https://github.com/fduwjj
2025-01-06 22:20:04 +00:00
Xuehai Pan
db3290846e [BE][Easy][10/19] enforce style for empty lines in import segments in test/d*/ (#129761)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129761
Approved by: https://github.com/fegin
2024-07-17 16:57:39 +00:00
Xuehai Pan
26f4f10ac8 [5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126
Approved by: https://github.com/kit1980
2024-05-27 14:49:57 +00:00
PyTorch MergeBot
55c0ab2887 Revert "[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)"
This reverts commit 7763c83af6.

Reverted https://github.com/pytorch/pytorch/pull/127126 on behalf of https://github.com/XuehaiPan due to Broken CI ([comment](https://github.com/pytorch/pytorch/pull/127126#issuecomment-2133044286))
2024-05-27 09:22:08 +00:00
Xuehai Pan
7763c83af6 [5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126
Approved by: https://github.com/kit1980
ghstack dependencies: #127122, #127123, #127124, #127125
2024-05-27 04:22:18 +00:00
Yuanhao Ji
e3effa5855 Enable UFMT on all of test/distributed (#123539)
Partially addresses #123062

Ran lintrunner on:

- `test/distributed`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123539
Approved by: https://github.com/ezyang
2024-04-17 06:46:02 +00:00
PyTorch MergeBot
52be63eb2c Revert "Enable UFMT on all of test/distributed (#123539)"
This reverts commit 89ac37fe91.

Reverted https://github.com/pytorch/pytorch/pull/123539 on behalf of https://github.com/DanilBaibak due to Broken trunk ([comment](https://github.com/pytorch/pytorch/pull/123539#issuecomment-2058329471))
2024-04-16 06:33:21 +00:00
Yuanhao Ji
89ac37fe91 Enable UFMT on all of test/distributed (#123539)
Partially addresses #123062

Ran lintrunner on:

- `test/distributed`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123539
Approved by: https://github.com/ezyang
2024-04-16 03:23:56 +00:00
Sergii Dymchenko
35bf5bac26 Fix "sandcastle_skip_if decorator name is confusing" (#95649)
Fixes https://github.com/pytorch/pytorch/issues/89473
See the issue https://github.com/pytorch/pytorch/issues/89473

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95649
Approved by: https://github.com/atalman, https://github.com/malfet
2023-03-03 09:29:40 +00:00
pritam
500fb24715 Ensure tensors are contiguous in functional all_gather.
We called `tensor.contiguous()` in the forward pass, however this was
after the `out_tensor_list` was built which results in the `out_tensor_list`
containing non-contiguous tensors resulting in errors.

Fixing this by moving the contiguous call above.

Differential Revision: [D37222870](https://our.internmc.facebook.com/intern/diff/D37222870/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79747

Approved by: https://github.com/fduwjj, https://github.com/wanchaol
2022-06-17 01:27:11 +00:00
pritam
44aa4ad894 Use _all_gather_base and fuse matmul for sharded linear.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78477

Use `_all_gather_base` instead of all_gather for col-wise sharding
since `_all_gather_base` returns a single fused tensor that can be used to
perform a single matmul instead of looping through and performing multiple
matmuls.

This improves performance for col-wise sharding.

Differential Revision: [D36754385](https://our.internmc.facebook.com/intern/diff/D36754385/)

Approved by: https://github.com/aazzolini, https://github.com/wanchaol
2022-06-01 17:17:34 +00:00
Junjie Wang
7c2489bdae [PyTorch][Distributed] Enable Reduce Scatter and modify all_to_all for sharded linear with more test cases. (#68786)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68786

To enable the auto grad for the sharded linear, we find we need to make some changes to the current nn function api (c10d api with auto grad enabled). So we made the following several changes:

1. Add a new api `reduce_scatter` since we need it in the rowwise sharding.
2. Modify the `all_to_all` api to make sure it consistent with the ones in distributed_c10d.py.
3. Found the cpp input params of `reduce_scatter` is missing input param, added more unit test to cover these cases.
4. Sync the NN test from gloo to nccl.
ghstack-source-id: 144860208

Test Plan: CI + Unit Test

Reviewed By: pritamdamania87

Differential Revision: D32569674

fbshipit-source-id: 9bd613f91bbf7a39eede0af32a5a5db0f2ade43b
2021-12-06 13:38:58 -08:00
Jane Xu
34051d74da Add test owner to distributed files starting with test_ (#66797)
Summary:
Action based on https://github.com/pytorch/pytorch/issues/66232

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66797

Reviewed By: gchanan

Differential Revision: D31761389

Pulled By: janeyx99

fbshipit-source-id: c27c9ab4acec1eb71d5edd4538cd113b770dfc6c
2021-10-19 10:55:20 -07:00
Pritam Damania
82d81455ae [2/N] Remove unittest.skip across all of torch.distributed. (#61887)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61887

1) Introduced a `sandcastle_skip_if` decorator that ensures these
tests just get passed on sandcastle.
2) Fixed all test files under `test/distributed` to not use `unittest.skip`

Overall goal is to avoid using skips since sandcastle tags these tests as
continuously skipping.
ghstack-source-id: 134382237

Test Plan: waitforbuildbot

Reviewed By: SciPioneer

Differential Revision: D29784152

fbshipit-source-id: 17b4df6c5a55ff1d1e8e1de128fa679c3dfbcb7d
2021-07-27 10:53:23 -07:00
Nikita Shulga
b587354e4c Add Python-3.9 CI testing (#50992)
Summary:
Skip number of tests adjust typing handling

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50992

Reviewed By: walterddr

Differential Revision: D26170388

Pulled By: malfet

fbshipit-source-id: 47852512aa3d5c25faf6687bcd0b1cbb332b0b20
2021-05-10 10:51:39 -07:00
Pavel Belevich
426852b4f0 Split test_c10d_spawn.py to test_c10d_spawn_gloo.py,test_c10d_spawn_nccl.py (#56599)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56599

Test Plan: NA

Reviewed By: SciPioneer

Differential Revision: D27913955

fbshipit-source-id: 7206e589fb7d08c55d08a58a3d57dc3d210a795e
2021-04-21 22:11:49 -07:00